INTELLIGENCE OF CHINESE AND JAPANESE
CHILDREN


PETER SANDIFORD AND RUBY KERR¹


University of Toronto 


During the winter of 1924-25 an extensive survey of the educa- 
tional system of British Columbia was made under the direction of 
Dr. J. H. Putman and Dr. G. M. Weir. The writer was called in 
to conduct the testing programme which formed part of the survey. 
In Vancouver and neighborhood a large number of Oriental immi- 
grants reside, especially Chinese and Japanese. The survey provided 
an excellent opportunity for measuring the intelligence of children of 
these alien groups and after a preliminary survey of the field, it was 
decided to give the Pintner-Paterson “Scale of Performance Tests”
to an unselected group of Chinese and Japanese pupils attending the 
Public Elementary Schools of Vancouver. Miss Ruby Kerr and her 
staff at the Vancouver Psychological Clinic readily consented to 
cooperate in the undertaking. Detailed directions for administering 
the tests were drawn up in accordance with the scheme laid down by 
Pintner and Paterson. Finally, the persons who gave the tests were 
given a preliminary training to secure uniformity of procedure and 
technique. 

The administration of the tests proved more arduous than was 
anticipated. In practice it was found that an examiner, working to 
capacity, could only secure from six to eight results per day. Eventu- 
ally, however, the 500 records which form the basis of the present study 
were obtained. 

The present position of intelligence testing in regard to alien groups,
both foreign-born and children of foreign-born, is somewhat
unsatisfactory. Especially is this true of the Chinese and Japanese
children. There is the ever-present problem of language affecting the
results, and the factor of selection plays an important (though
undetermined) rôle.

¹ This study was planned by Peter Sandiford, the writer of the paper, and
carried out by Miss Ruby Kerr of the Vancouver Psychological Clinic and her
staff.

Pyle’s early study showed the Chinese children to be superior to
Americans in rote memory, but inferior in working out logical
relationships and in speed of learning. Combining the averages of his
various tests, the efficiency of the Chinese boys was found to be about
9[?] per cent of that of American boys, while the efficiency of Chinese
girls was only 77 per cent of that of American girls. Pyle’s study was
made before intelligence testing had made much headway. He used
tests for rote memory, logical memory, substitution, analogues, and
the spot pattern test; and the subjects employed Chinese characters
in writing answers to the questions.

DIAGRAM 1.—Showing the range of the middle half of the IQ’s of Japanese and Chinese
pupils in the Vancouver Public Schools.

Yeung tested 105 unselected Chinese children (mainly from 9 to 
11 years of age) in the Oriental school of San Francisco and obtained 
a median IQ of 97. This compares favorably with unselected white 
children of the same city providing an allowance is made for the 
language handicap. 

Using the Binet, Beta, and Stanford Achievement Tests, Darsie 
found that Japanese children resident in California and ranging in 
age from 10 to 15 years were little, if any, below the standards of 
California white children. The IQ’s of urban children gave a median
of 90; of children in large cities, 99. The Beta results, which were
unaffected by the factor of language, were above those of white children.
The educational quotient (95) on the Stanford Achievement test is 
also indicative of superiority. 

In the Vancouver Survey the method of scoring first used was that 
of the Year Scale. This gave a mental age from which the IQ’s were 
calculated. Table I gives the distribution of these according to race
and sex. Diagram 1 shows the results of Table I in graphical form. 
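The paper does not spell out how an IQ was obtained from a Year Scale mental age. A minimal sketch, assuming the ratio definition customary at the time (mental age divided by chronological age, times 100), is given below. The function name and the example child are hypothetical and are not taken from the survey records.

```python
def ratio_iq(mental_age_years: float, chronological_age_years: float) -> float:
    """Ratio IQ, assumed here to be 100 * mental age / chronological age."""
    if chronological_age_years <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * mental_age_years / chronological_age_years


# Hypothetical child (not a survey record): mental age 11.0, actual age 10.0.
example_iq = ratio_iq(mental_age_years=11.0, chronological_age_years=10.0)
print(round(example_iq, 1))
```

Under this definition any upward drift in the mental age assigned by a scoring method raises the IQ in direct proportion, which is why the choice of scoring method matters so much in the later sections.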


TABLE I.—DISTRIBUTION OF THE IQ’s OF JAPANESE AND CHINESE PUPILS IN
VANCOUVER PUBLIC SCHOOLS ACCORDING TO RACE AND SEX

                       Japanese                        Chinese
IQ             Males   Females   Total         Males   Females   Total
 60- 69        [entries illegible]
 70- 79        [entries illegible]
110-119          37       29       66            33       22       55
120-129          38       24       62            18        9       22
130-139        [entries illegible]
140-149           1        6        7             5        5       10
150-159        [entries illegible]
160-169        [entries illegible]
170-179        [entries illegible]

Q1             104.8    100.4    102.4          98.2     97.3     98.0
Median         115.4    112.8    114.2         107.7    107.0    107.4
Q3             125.0    125.0    125.0         117.4    116.7    117.1
The results are somewhat surprising, even startling. The Median
IQ of Japanese males is 115.4; of Japanese females 112.8; of both
together 114.2. The Median IQ of Chinese males is 107.7; of Chinese
females 107.0; of both together 107.4. Five-sixths of the Japanese
boys exceed a score which is exceeded only by one-half of the whites;
and of the total Japanese group 80 per cent reach or exceed the median
score of the whites. The showing of the Chinese is lower than that of
the Japanese, but better than that of white children. More than
71 per cent of the Chinese reach or exceed the median score of the
whites.

From general considerations we should expect the median IQ to 
advance with the grades. The selective character of schooling results 
in the duller pupils being left behind in the lower grades. Table II
shows the medians for Japanese and Chinese pupils in Vancouver. The
results are irregular in that they show no continuous increase with 
grade either for the Chinese or Japanese. They show, however, a 
curious parallelism; Grade II results are high and Grade IV low. The 
methods of promotion used in the schools may account for the 
phenomenon. 


TABLE II.—MEDIAN IQ’s OF JAPANESE AND CHINESE PUPILS IN VANCOUVER
PUBLIC SCHOOLS ARRANGED BY GRADES

                   Japanese                      Chinese
Grade      Number of    Median IQ        Number of    Median IQ
             cases                         cases
I              60       [illegible]          54         109.0
II             40         116.1              48         110.2
III            82         114.0              47         106.4
IV             61         112.0              30         104.2
V              20         112.5              25         108.0
VI             11         113.5              15         107.5
VII             2         107.0               4         106.2
VIII           ..           ..                1         114





The validity of the foregoing results depends upon the validity of 
the test used—the Pintner-Paterson Scale of Performance Tests. In 
the original publication the authors give four methods of arriving at an 
index of mental ability, namely, (1) a year scale; (2) median mental 
age; (3) a point scale; and (4) a percentile method. They were of the 
opinion that the percentile method offered the best possibilities for 
future work. More recently Pintner has stated: “Those who have
made practical use of the scale find the median mental age method of
computing mental age the simplest and most accurate.”

All four methods were used with the Vancouver results. The 
IQ’s from the year scale have been already dealt with. We have 
reason to believe them fairly reliable. In an unpublished study made
at the University of Toronto the writer found a Pearson coefficient
of correlation of .81 ± .023 between the IQ’s obtained from the
Stanford-Binet and the Pintner-Paterson Performance Tests year scale,
with 100 children ranging in age from 8 to 11. The Pintner IQ’s 
tended to run higher than those obtained from the Stanford-Binet 
scores. A comparison of the results of scoring the Vancouver tests 
by means of the age scale and the median mental age is given in 
Table III. 
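The figure after the ± sign is a probable error. As a hedged check, the sketch below applies the classical formula PE = 0.6745 (1 − r²) / √N, which was the usual convention of the period; the paper itself does not say which formula was used, so this is an assumption. With r = .81 and N = 100 it returns the quoted value.

```python
import math


def probable_error_of_r(r: float, n: int) -> float:
    """Classical probable error of a Pearson r: 0.6745 * (1 - r^2) / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Values quoted in the text: r = .81 with 100 children.
pe = probable_error_of_r(0.81, 100)
print(f"{pe:.3f}")  # 0.023
```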

TABLE III.—COMPARISON OF THE AGE SCALE AND THE MEDIAN MENTAL AGE
METHODS OF SCORING 500 RESULTS FROM THE PINTNER-PATERSON PERFORMANCE
TESTS GIVEN TO CHINESE AND JAPANESE PUPILS OF VANCOUVER
PUBLIC SCHOOLS

                     Number of    Median age    Median age     Pearson r from
Group tested           cases      from year     from median    the two methods
                                    scale       mental age       of scoring
Japanese males          144          11.0           8.5          .90 ± .011
Japanese females        132          11.0           8.5          .92 ± .010
Chinese males           131          10.2           8.1          .92 ± .010
Chinese females          93          10.0           8.2          .90 ± .013











The year scale gives consistently higher scores. If the IQ’s had 
been computed from a mental age obtained by the median mental 
age method of scoring they would have run from 20 to 25 points lower. 
We are inclined to believe that the year scale method slightly magnifies 
the true values while the median mental age method greatly reduces 
them. The Pearson r between the two scores, while consistently
high, shows that the two methods of scoring are not exactly
commensurate.

The study was continued by analyzing the scores by the point
scale method of determining mental age. The results are given in Table
IV. The median mental ages, as may be seen by comparing them with
Table III, invariably fall between those obtained by the year scale
method of computation and those by the median mental age method,
but nearer to the latter than the former.

TABLE IV.—SUMMARY OF MEDIAN MENTAL AGE CALCULATED BY THE POINT
SCALE METHOD OF COMPUTATION

                     Number of               Q2
Group                  cases       Q1     (median)     Q3       QD
Japanese males          144       7.82      8.82      10.58    1.38
Japanese females        132       7.74      8.73      10.1     1.18
Chinese males           131       7.47      8.56       9.72    1.13
Chinese females          93       7.32      8.34       9.51    1.10
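The QD column of Table IV is the quartile deviation, half the distance between the first and third quartiles. The sketch below recomputes it from the tabled Q1 and Q3 as a consistency check. Rounding half upward to two decimals is an assumption about how the original figures were rounded, and the dictionary and function names are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

# (Q1, Q3) in years of mental age, copied from Table IV.
TABLE_IV = {
    "Japanese males": ("7.82", "10.58"),
    "Japanese females": ("7.74", "10.10"),
    "Chinese males": ("7.47", "9.72"),
    "Chinese females": ("7.32", "9.51"),
}


def quartile_deviation(q1: str, q3: str) -> Decimal:
    """QD = (Q3 - Q1) / 2, rounded half up to two decimals (assumed rounding)."""
    half_range = (Decimal(q3) - Decimal(q1)) / 2
    return half_range.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


qd_by_group = {group: quartile_deviation(q1, q3) for group, (q1, q3) in TABLE_IV.items()}
for group, qd in qd_by_group.items():
    print(f"{group}: QD = {qd}")
```

Under these assumptions all four recomputed values agree with the QD column as printed.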

The evidence from the three different methods of scoring indicates 
that the IQ’s calculated from the year scale ages are probably too high.
In other words the Japanese and Chinese are not so intelligent as
Table I makes them out to be. But what correction should be made 
is still uncertain. Taking every form of evidence that is available 
into consideration there is every reason for believing that the Japanese 
are the most intelligent racial group resident in British Columbia 
with the Chinese as a more doubtful second. 

The superiority is undoubtedly due to selection. In the main it 
is the Japanese and Chinese possessing the qualities of cleverness, 
resourcefulness and courage who emigrate to British Columbia; the 
dullards and less enterprising are left behind. This superiority of 
an emigrant stock is no new phenomenon in world history. There 
are those who maintain that Great Britain owes her eminent position
in the world to the fact that only the clever and sturdy could secure
a footing on her shores. The American army tests showed that those
who had forced the Rocky Mountain barrier and reached the Pacific
slopes were of higher intelligence than the groups they left behind. 
Secondly, the groups tested in the elementary schools are probably a 
selected group; the relatively more intelligent Chinese and Japanese 
children will be sent to school in higher proportion than obtains among 
the whites. But from the political and economic standpoints the 
presence of an industrious, clever and frugal alien group, capable 
(as far as mentality is concerned) of competing successfully with the 
native whites in most of the occupations they mutually engage in, 
constitutes a problem which calls for the highest quality of
statesmanship if it is to be solved satisfactorily.


SUMMARY 


Five hundred Japanese and Chinese pupils attending the public 
elementary schools of the city of Vancouver were tested individually 
by the Pintner-Paterson “Scale of Performance Tests,” a test which
obviates the need of a reading and writing knowledge of English. 
The IQ’s computed from a mental age derived from a year scale of 
scoring showed a median IQ of 115.4 for Japanese males; 112.8 for 
Japanese females; 107.7 for Chinese males and 107.0 for Chinese 
females. Mental ages computed by a point scale method gave median
mental ages considerably lower than those obtained by the year scale
method, and those computed by a median mental age method were
lower still. Yet there is evidence to show that the year scale method
gives substantially accurate results, even though they run a trifle higher
than the true values. If these results are reliable the Japanese form
the cleverest racial group resident in British Columbia, with the
Chinese forming a more doubtful second. The presence of so many
clever, industrious and frugal aliens constitutes a political and
economic problem of the greatest importance.
CORRECTIONS FOR CHANCE AND “GUESS” vs.
“DO NOT GUESS” INSTRUCTIONS IN
MULTIPLE-RESPONSE TESTS¹


G. M. RUCH AND MARK H. DEGRAFF 


University of Iowa 


Historical.—In February 1925, Ruch and Stoddard² published the
results of two minor studies which seemed to show small but fairly
definite losses in reliability when the method of scoring followed the
conventional formula, Score = R − W/(n − 1), instead of simply
the “Rights.” In one of these experiments the population used was
inadequate (43). The second experiment used populations of about
135 for each determination reported. In both cases, the number of
times that the reliability was lowered by application of the correction
formula greatly exceeded the number of rises in reliability. These
two studies provided no data on either of two questions more
important than reliability alone, viz., the effects of chance corrections
on validity, and the relative merits of instructions to “Guess” or “Do
Not Guess.”
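A minimal sketch of the conventional correction formula follows, taking R as the number of right answers, W as the number of wrong answers, and n as the number of alternative responses per item. The function names and the example pupil are hypothetical. The second function computes the expected change in corrected score from a single purely random guess, which is the quantity the formula is designed to cancel.

```python
from fractions import Fraction


def corrected_score(rights: int, wrongs: int, n_choices: int) -> Fraction:
    """Score = R - W / (n - 1); omitted items count neither for nor against."""
    if n_choices < 2:
        raise ValueError("an item needs at least two alternatives")
    return Fraction(rights) - Fraction(wrongs, n_choices - 1)


def expected_gain_from_blind_guess(n_choices: int) -> Fraction:
    """Expected change in corrected score from one purely random guess."""
    p_right = Fraction(1, n_choices)
    penalty_if_wrong = Fraction(1, n_choices - 1)
    return p_right * 1 - (1 - p_right) * penalty_if_wrong


# Hypothetical pupil: 60 right, 20 wrong, 20 omitted on a 100-item 5-Response test.
print(corrected_score(60, 20, 5))          # 55
print(expected_gain_from_blind_guess(5))   # 0
```

For a True-false or 2-Response test (n = 2) the formula reduces to Rights minus Wrongs, and for every n the expected gain from a blind guess is exactly zero.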

In December 1925, Paterson and Langlie³ published results
confirming those of the first mentioned study, and based upon more
adequate populations.

During the past year, two independent investigations have been
in progress along roughly similar lines, each attempting to validate or
disprove the correction formula. Wood has just published⁴ his
findings and the present paper presents the results of DeGraff and Ruch.

Wood calls attention to the same issue raised by the present paper,
viz., that validity coefficients are raised by chance corrections but reliability
coefficients appear to be lowered. The present investigations are in
good agreement on this point as will be shown by Tables I and II.
The fact that Wood’s test materials were from the fields of French and
law and the subjects were college students makes the agreement with
the present data on tests in history with elementary and high school
pupils all the more striking. The main difference between the
experiments which Wood reports and the present studies lies in the fact
that Wood dealt only with instructions not to guess while this paper
presents comparisons of the effects of instructions to guess and not to
guess; the rôle of chance corrections being studied for both types of
instructions.

¹ A preliminary report of one of a group of studies made possible by a grant from
The Commonwealth Fund.

² Journal of Educational Psychology, Feb., 1925, pp. 89-103.

³ Journal of Applied Psychology, Dec., 1925, pp. 339-348.

⁴ Journal of Educational Psychology, Jan., 1926, pp. 1-22 and April, 1926, pp.
263-269.

Materials and Method of This Study.—A total of 2453 pupils was
used in the present study, distributed as follows:

                    Recall     7-R     5-R     3-R     2-R     True-false
“Guess”              2453      233     236     246     223        239
“Do not guess”                 229     274     229     281        263
Total                          462     510     475     504        502





Two hundred items covering the general field of United States 
history were selected carefully by analysis of several leading textbooks. 
These were broken into two forms, hereafter designated “Form A”
and “Form B.” The further treatment of these items was that of
“translating” them into five different types: recall (completion),
7-Response, 5-Response, 3-Response, 2-Response, and True-false.
The earlier paper cited (Ruch-Stoddard) describes the method of doing 
this. It seems necessary at this point to present the full list of tests 
in order to make the plan of experimentation clear. Eighteen test 
booklets of 8 to 12 pages each were prepared as follows: 


1. Recall, Form A ............................................... 100 items

2. Recall, Form B ............................................... 100 items

3. 7-Response, Form A, “Do Not Guess” instructions .............. 100 items

4. 7-Response, Form B, “Do Not Guess” instructions .............. 100 items

5. 7-Response, Form A, “Guess” instructions ..................... 100 items

6. 7-Response, Form B, “Guess” instructions ..................... 100 items
Etc.


NOTE.—The remaining 12 tests were built in pairs of four exactly like the
7-Response tests except that the items were constructed, successively, in 5-, 3-,
2-Response, and True-false forms.


370 The Journal of Educational Psychology

All of the 2453 pupils took Recall, Form A on the day of the first
test sitting. Recall, Form B followed the next day. The third and
last sitting provided for ten sub-groups broken purely by chance.
Each of these sub-groups received one of the multiple-response test
booklets (or a true-false booklet) with both Forms A and B printed
under one cover. The test booklets were arranged prior to the third
sitting in serial order so that, when the teacher “dealt from the top of
the deck,” she automatically sifted the 10 kinds of booklets
throughout her group in a chance fashion. The 10 groups thus formed may
be described as follows:


1. 7-Response “Do Not Guess” group
2. 7-Response “Guess” group

3. 5-Response “Do Not Guess” group
4. 5-Response “Guess” group

5. 3-Response “Do Not Guess” group
6. 3-Response “Guess” group

7. 2-Response “Do Not Guess” group
8. 2-Response “Guess” group

9. True-false “Do Not Guess” group
10. True-false “Guess” group.
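The dealing procedure that produced these ten groups can be sketched as follows. This is an illustration under stated assumptions, not a record of the actual stacks: the order of booklet kinds within the stack and the class size of 34 are hypothetical. The point of the sketch is that a stack arranged in repeating serial order spreads the ten kinds almost evenly over any class, while the chance element comes from the order in which pupils happen to be seated.

```python
from collections import Counter
from itertools import cycle, islice

# Ten booklet kinds in an assumed serial order (the actual order is not reported).
BOOKLET_KINDS = [
    f"{form} / {instruction}"
    for form in ("7-Response", "5-Response", "3-Response", "2-Response", "True-false")
    for instruction in ("Do Not Guess", "Guess")
]


def deal_booklets(n_pupils: int) -> list[str]:
    """Hand out booklets from a stack pre-arranged in repeating serial order."""
    return list(islice(cycle(BOOKLET_KINDS), n_pupils))


class_assignment = deal_booklets(34)  # hypothetical class of 34 pupils
counts = Counter(class_assignment)
print(max(counts.values()) - min(counts.values()))  # 1
```

Group sizes within a class can differ by at most one under this scheme, which is consistent with the roughly equal sub-group sizes reported for the study as a whole.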


The data presented in Tables IV and V show exactly the ability 
of these sub-groups. 

Correlations of one of the multiple-choice tests with the criterion 
(Recall) are termed here “Validity Coefficients.” The reliability
coefficients are invariably the correlations of Forms A and B of the 
same test. 

The items were the same 100 in all A forms throughout and the 
same 100 in all B forms throughout, as far as it is possible to assert
that an item remains the same when changed successively from Recall, 
to 5-Response, to True-false, etc., form. 

The two different sets of instruction used are given below; the 5- 
Response form being chosen as the example. The wording had to 
be changed slightly for the other multiple-choice forms to make the 
obvious adaptations necessary. 


“DO NOT GUESS” INSTRUCTIONS


“NOTE CAREFULLY.—If you are in doubt about the answer to any question,
leave it blank. Do not guess! You will be penalized for all wrong answers.
The tests are scored in such a way that you will lose more than you gain by guessing.

“REMEMBER.—Do not guess. Answer only those that you are reasonably sure
about.”


“GUESS” INSTRUCTIONS


“NOTE CAREFULLY.—Do not leave any question unanswered. If you don't
know, guess. It is better to guess than to leave a question blank because you
have one chance in five of getting it right by pure guessing. You should try to
make as logical or shrewd a guess as possible . . .

“REMEMBER.—Try to answer every question. Guess if you don’t know.”
The above instructions were observed surprisingly well. Very
few papers showed any blanks; the small number of questions which
were left unanswered might well be attributed to oversights or hasty
reading.

The Results.—Table I presents the reliability coefficients and Table
II the validity coefficients. Table III shows the r’s corrected for
attenuation. Tables IV and V give the means and sigmas for all
tests. In Tables I and II all statistically “significant” differences
are printed in bold-faced type. All other differences are less than
three times their probable errors. For this reason the probable errors
are not quoted for the r’s in order to simplify the printed tabulations.
(The numbers of cases are given in a preceding section, however.)


TABLE I.—RELIABILITY COEFFICIENTS

                        “Guess”                  “Do Not Guess”                  Differences
                    1       2       3          4       5       6
                 Uncor-   Cor-   Differ-    Uncor-   Cor-   Differ-
                 rected  rected  ence,      rected  rected  ence,      4−1     4−2     5−1     5−2
                                  2−1                        5−4
Recall (.960)
7-Response        .800    .839   +.039       .886    .907   +.021    +.086   +.047   +.107   +.066
5-Response        .864    .902   +.038       .862    .882   +.020    −.002   −.040   +.018   −.020
3-Response        .837    .858   +.021       .886    .890   +.004    +.049   +.028   +.053   +.032
2-Response        .745    .864   +.119       .859    .843   −.016    +.114   −.005   +.098   −.021
True-false        .641    .780   +.139       .885    .837   −.048    +.241   +.105   +.196   +.059
Mean              [illegible]                .876    .872

NOTE.—Values in bold-faced type show all differences which are 3.0 or more times their
probable errors, and hence are probably “significant” differences.


TABLE II.—VALIDITY COEFFICIENTS

                        “Guess”                  “Do Not Guess”                  Differences
                    1       2       3          4       5       6
Recall vs.       Uncor-   Cor-   Differ-    Uncor-   Cor-   Differ-
                 rected  rected  ence,      rected  rected  ence,      4−1     4−2     5−1     5−2
                                  2−1                        5−4
7-Response A      .871    .873   +.002       .927    .926   [three entries illegible]  +.055   +.053
7-Response B      .816    .861   +.045       .872    .898   +.026    +.056   +.011   +.082   +.037
5-Response A      .907    .910   +.003       .891    .918   +.027    −.016   −.019   +.011   −.008
5-Response B      .860    .903   +.043       .836    .870   +.034    −.024   −.067   +.010   −.033
3-Response A      .838    .848   +.010       .845    .915   +.070    +.007   −.003   +.077   +.067
3-Response B      .797    .875   +.078       .852    .902   +.050    +.055   −.022   +.105   +.027
2-Response A      .859    .865   +.006       .740    .775   +.035    −.119   −.125   −.084   −.090
2-Response B      .735    .806   +.071       .752    .868   +.116    +.017   −.054   +.133   +.062
True-false A      .804    .839   +.035       .749    .860   +.111    −.055   −.090   +.056   +.021
True-false B      .675    .801   +.126       .768    .856   +.088    +.093   −.033   +.181   +.055
Mean              .815    .868    ....       .823    .890

Proportion of “significant” differences (bold-faced type) to total number of
differences (10):
    +                            2:10                        5:10     2:10    1:10    7:10    3:10
    −                            ....                        0:10     1:10    3:10    1:10    1:10
    Both                         2:10                        5:10     3:10    4:10    8:10    4:10


TABLE III.—CORRELATIONS OF RECALL (CRITERION) AND MULTIPLE-RESPONSE
TESTS, CORRECTED FOR ATTENUATION

                     “Guess” instructions           “Do Not Guess” instructions
                  Uncorrected    Corrected          Uncorrected    Corrected
                  for chance     for chance         for chance     for chance
7-Response           .967          .971                .980          .982
5-Response           .974          .975                .953          .976
3-Response           .916          .954                .925          .988
2-Response           .945          .921                .838          .917
True-false           .943          .953                .827          .962
Mean                 .949          .955                .905          .965


TABLE IV.—MEAN SCORES

                               Multiple-response A               Multiple-response B
Group                Recall    Uncor-     Cor-         Recall    Uncor-     Cor-
                       A       rected    rected          B       rected    rected
7-Response (G)¹       25.9      50.0      41.5          26.2      39.6      32.6
7-Response (N)²       27.6      44.9      40.0          27.6      37.2      33.1
5-Response (G)        25.7      54.2      43.4          26.9      45.5      35.4
5-Response (N)        28.0      48.8      42.3          28.6      42.1      36.4
3-Response (G)        25.6      62.2      43.6          26.1      55.5      36.6
3-Response (N)        27.4      54.1      41.9          27.5      48.2      36.1
2-Response (G)        26.7      71.7      43.6          27.4      67.2      37.1
2-Response (N)        33.4      65.1      45.8          33.3      60.3      40.2
True-false (G)        27.4      65.8      32.3          26.8      61.3      26.0
True-false (N)        27.5      51.0      30.8          27.6      47.6      26.8

¹ (G) means instructions to “Guess.”
² (N) means instructions “Do Not Guess.”
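The criterion of three times the probable error can be illustrated for one pair of coefficients. The sketch below is not the authors' stated computation: it assumes the classical probable error of r, 0.6745 (1 − r²) / √N, and treats the two coefficients as coming from independent groups, so that the probable error of their difference is the square root of the sum of the squared probable errors. That assumption is reasonable for a “Guess” versus “Do Not Guess” comparison, where different pupils are involved, but not for a corrected versus uncorrected comparison within one group.

```python
import math


def probable_error_of_r(r: float, n: int) -> float:
    """Classical probable error of a Pearson r (assumed formula)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


def critical_ratio(r_a: float, n_a: int, r_b: float, n_b: int) -> float:
    """Difference between two independent r's divided by its probable error."""
    pe_difference = math.hypot(
        probable_error_of_r(r_a, n_a), probable_error_of_r(r_b, n_b)
    )
    return abs(r_a - r_b) / pe_difference


# True-false reliabilities from Table I, uncorrected scores:
# "Guess" r = .641 (239 pupils) vs. "Do Not Guess" r = .885 (263 pupils).
ratio = critical_ratio(0.641, 239, 0.885, 263)
is_significant = ratio >= 3.0
print(is_significant)
```

Under these assumptions the True-false difference is many times its probable error, in keeping with its being among the largest differences in Table I.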
TABLE V.—STANDARD DEVIATIONS

                               Multiple-response A               Multiple-response B
Group                Recall    Uncor-     Cor-         Recall    Uncor-     Cor-
                       A       rected    rected          B       rected    rected
7-Response (G)        15.1      17.4      20.0          16.3      20.7      20.9
7-Response (N)        17.1      19.7      21.0          17.5      20.2      20.6
5-Response (G)        16.9      17.0      20.8          17.3      20.9      22.1
5-Response (N)        16.8      19.8      21.1          17.6      21.1      21.2
3-Response (G)        15.0      13.9      19.8          16.4      16.3      19.6
3-Response (N)        15.7      16.2      20.5          16.5      19.6      20.0
2-Response (G)        16.4      11.4      22.4          17.0      13.8      20.6
2-Response (N)        20.6      16.1      20.7          19.7      17.9      20.1
True-false (G)        16.5      10.5      19.4          16.9      12.6      17.6
True-false (N)        15.9      17.4      18.8          16.5      18.3      17.3








In order to conserve space and discussion, the results given in 
Tables I to V will be presented as a summary with references to the 
table supplying the basis for each conclusion stated. 


SUMMARY AND CONCLUSIONS 


1. When instructions are given to “Guess,” the chance correction
formula raises the reliability. (Columns 1 and 2 of Table I.)

2. When the subjects are instructed “Not to Guess,” the chance
correction appears to be of no value in increasing reliability.
(Columns 4 and 5 of Table I.)

3. As far as sheer reliability is concerned, it seems to be true that
uncorrected “Do Not Guess” instruction scores are best. (Columns
1, 2, 4, and 5 of Table I.)

4. Instructing subjects to omit doubtful items rather than guessing 
seems to have some effectiveness. (Tables I and IV.) 

5. Conclusions from the behavior of reliability coefficients alone 
are not defensible since validity coefficients show different trends under 
corrections for chance. (Comparison of Table I and II.) 

6. The observed order of decreasing validity for the four techniques 
studied was as follows:

Instructions not to guess and chance corrections

Instructions to guess and chance corrections 

Instructions not to guess and without chance corrections 

Instructions to guess and without chance corrections. (Table II.) 

7. The practice of instructing pupils to guess appears to have no 
particular merit, and it does not insure a better “working out” of the
formula for corrections for chance. (Columns 2 and 5 of Table II.) 

8. Due to inconsistencies in the behaviors of the various
determinations in Table II, it would seem wise to repeat such experiments
many times with varied materials. However, it is not likely that 
totally unambiguous answers to the issues we have raised will be had 
unless populations of at least 5,000 to 10,000 are used in order to reduce 
unreliabilities of differences. This argues that, for most practical 
purposes, the exact technique to be chosen is not a matter of tremend- 
ous importance. We can stand on reasonably sure ground, for the 
present, to instruct pupils not to guess; leaving the matter of correc- 
tions for chance to the whim of the examiner. 

9. When the various tests are corrected to allow for attenuation 
due to errors of measurement, there is no evidence that the tests used 
differ greatly in the functions which they measure. (Table III.)

10. The effect of correction for chance on the mean scores of True- 
false tests is markedly different from such corrections applied to Mul- 
tiple-choice tests proper. (Table IV.) Corrected True-false means 
proved to be about 10 points lower than those of the 2-Response 
arrangements of the “same” items. This agrees with the earlier
findings of Ruch and Stoddard. 

This point needs further study. The issue involved may prove 
to lie in the situation outlined below. In the case of the items: 


1. A famous southern general was Lee.......... true false 


2. A famous southern general was................ Lee Grant 





If we assume total ignorance of the man, Lee, on the part of a 
pupil, it is obvious that the chances are 50:50 that he will succeed on 
Item 1. We can make the same assumption in the case of Item 2, 
but assume at the same time that the pupil knows that Grant was a 
northerner. His chances are, therefore, 100:0 of success. In the case
of Item 2, he would have to be totally ignorant of two facts, to have the
situation resolve itself into a 50:50 break for success. The pupil’s
response in a genuine 2-Response test may be made either upon the
basis of knowledge that one thing is true or upon knowledge that the
other thing is untrue. If pupils are 100 per cent informed about both 
facts, or 100 per cent informed about one fact and 100 per cent ignor- 
ant about the other, they succeed on the test item in 2-Response form. 
They succeed half the time if they are 100 per cent ignorant about 
both facts. 

The foregoing may or may not be the reason for the differences we 
have noted but it is a tempting hypothesis. 

11. The variability of the scores is increased by applying correc- 
tions for chance. (Table V.) 
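The hypothesis advanced under conclusion 10, that a 2-Response item can be answered from knowledge of either of two facts while a True-false item rests on one, can be put in numerical form. The sketch below is an illustration only: it assumes that knowledge of the two facts is statistically independent and that an uninformed pupil guesses at even odds, neither of which the paper asserts. Function and parameter names are hypothetical.

```python
def p_true_false(p_knows_statement: float) -> float:
    """Pupil answers correctly if informed, otherwise guesses between two options."""
    return p_knows_statement + (1.0 - p_knows_statement) * 0.5


def p_two_response(p_knows_right: float, p_knows_wrong_is_wrong: float) -> float:
    """Success if either fact is known; a 50:50 guess only if both are unknown."""
    p_ignorant_of_both = (1.0 - p_knows_right) * (1.0 - p_knows_wrong_is_wrong)
    return 1.0 - 0.5 * p_ignorant_of_both


print(p_true_false(0.0))          # 0.5
print(p_two_response(0.0, 1.0))   # 1.0
print(p_two_response(0.0, 0.0))   # 0.5
```

The three printed cases correspond to the Lee and Grant example: total ignorance of Lee on the True-false item, ignorance of Lee with knowledge that Grant was a northerner, and ignorance of both facts. If the hypothesis holds, fewer wrong answers on 2-Response items arise from pure chance, so the same correction formula would remove less from 2-Response scores than from True-false scores.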





THE INTELLIGENCE OF PREPARATORY SCHOOL 
STUDENTS 


HAROLD E. JONES 


Columbia University 


Intelligence tests in private (non-parochial) schools have as a rule 
shown higher averages than those yielded by the general school
population. This superiority has been variously interpreted as due to
inadequacy of the tests,¹ to differential standards of admission,² and to
the fact of selection from a superior social group. 

The Army Alpha, Form 6, was given by the writer to 125 boys at 
the Riverdale Country Day School, in New York. Eighty-three in 
Grades IX to XII (Forms 3 to 6) were tested May 1, 1924. Forty- 
two additional, belonging for the most part to new sections in Grades 
IX and X, were tested November 6, 1925. The records as presented 
below include all of the members of these classes with the exception of 
three boys who had been in this country a short time, and whose tests 
were rejected because of language handicap. 


TABLE I.—DISTRIBUTION OF 125 CASES IN ARMY ALPHA

Score       Grade IX    Grade X    Grade XI    Grade XII    Total
190-99         ..          ..          2           ..          2
180-           ..           1         ..            3          4
170-           ..           9         ..            3         12
160-            2           7          4            4         17
150-            5           7          5            3         20
140-            1           9          4            1         15
130-            9           8          2            1         20
120-            4           5          1           ..         10
110-           10           5         ..           ..         15
100-            4           1          1           ..          6
90-            ..          ..         ..           ..         ..
80-             4          ..         ..           ..          4
80- 4 4 




















The median score for the entire group is 145 ± 2.0. The
distribution is overweighted by members of the two lower grades, owing to
the fact that incoming sections were tested in these grades in
successive years; the average nevertheless is higher than the averages which
have commonly been found in colleges and universities.⁴,⁵ Only
one report seems to have been published on the use of the Army
Alpha in preparatory schools. In a survey of the Hotchkiss School in
Connecticut, J. E. Anderson² tested 278 boys in the four upper forms.
His results are compared with the Riverdale data in the following
tables. (The Riverdale measures were not calculated from Table I,
but from a frequency tabulation in steps of 5.)
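Although the published measures were taken from a finer tabulation, the median of the whole group can be approximated from the Total column of Table I by linear interpolation within the class that contains the middle case. The sketch below makes two assumptions that should be read as such: the nominal score (for example 140) is taken as the lower limit of each class, and the unreadable 90- row is taken as empty, since the remaining totals already sum to 125.

```python
# (lower limit of class, frequency) from the Total column of Table I, lowest first.
# The 90- class is assumed to be empty; the other eleven totals sum to 125.
TOTAL_COLUMN = [
    (80, 4), (90, 0), (100, 6), (110, 15), (120, 10), (130, 20),
    (140, 15), (150, 20), (160, 17), (170, 12), (180, 4), (190, 2),
]
CLASS_WIDTH = 10


def grouped_median(classes, width):
    """Median of a grouped distribution by interpolation within the median class."""
    n = sum(freq for _, freq in classes)
    half = n / 2
    cumulative = 0
    for lower, freq in classes:
        if freq > 0 and cumulative + freq >= half:
            return lower + width * (half - cumulative) / freq
        cumulative += freq
    raise ValueError("empty distribution")


median_alpha = grouped_median(TOTAL_COLUMN, CLASS_WIDTH)
print(median_alpha)  # 145.0
```

With 125 cases the middle point falls at 62.5; 55 cases lie below 140 and 15 lie in the 140 class, so the interpolated value is 140 + 10 × 7.5/15, in agreement with the median reported in the text.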



































TABLE II

Grade and school   | Q1  | Median    | Q3  | Cases
Grade XII
  Riverdale        | 154 | 164 ± 3.7 | 179 |  15
  Hotchkiss        | 142 | 156 ± 1.5 | 176 |  75
Grade XI
  Riverdale        | 141 | 152 ± 3.0 | 162 |  19
  Hotchkiss        | 139 | 154 ± 1.4 | 165 |  83
Grade X
  Riverdale        | 135 | 147 ± 1.9 | 166 |  52
  Hotchkiss        | 132 | 144 ± 1.7 | 159 |  60
Grade IX
  Riverdale        | 113 | 122 ± 2.5 | 138 |  39
  Hotchkiss        | 114 | 126 ± 1.7 | 140 |  60
TABLE III.—PER CENT IN ALPHA LETTER GRADES

Grade and school   |  A   |  B   |  C+
Grade XII
  Riverdale        | 93.3 |  6.7 |
  Hotchkiss        | 81.1 | 17.3 |  1.3
Grade XI
  Riverdale        | 84.2 | 15.8 |
  Hotchkiss        | 82.3 | 15.7 |  2.4
Grade X
  Riverdale        | 75.0 | 23.1 |  1.9
  Hotchkiss        | 64.7 | 33.2 |  1.7
Grade IX
  Riverdale        | 35.9 | 51.2 | 12.9
  Hotchkiss        | 31.5 | 53.0 | 14.9

The two schools are seen to be nearly identical in the distribution
of intelligence test scores. The differences between the medians at
each grade level are so slight as to be unreliable. The student bodies
at Hotchkiss and Riverdale are recruited from the same social groups,
the fathers ranking in the top two classifications of the Taussig Scale,
and ranking above a PE value of 15 in the Barr Scale of Occupational
Status. The majority of the graduates of the two schools enter
Harvard, Yale, Princeton, or such smaller colleges as Williams or
Amherst.

If we compare the pooled results for Hotchkiss and Riverdale with
the data obtained from the use of the Army Alpha in high schools and
colleges, we find an interesting series of differences. The high school
and college norms are taken from the Manual of Instruction issued by
the Bureau of Educational Measurements and Standards, Kansas
State Teachers College. The high schools represented are in New
York, Michigan, Wisconsin, Illinois, Kansas and Iowa.











TABLE IV

Group                            | Median    | Cases
Preparatory schools, Grade XII   | 156 ± 1.3 |   90
High schools, Grade XII          | 120       | 1215
Colleges, senior year            | 144       |  797
Preparatory schools, Grade XI    | 153 ± 1.3 |  102
High schools, Grade XI           | 117       | 1518
Colleges, junior year            | 141       | 1147
Preparatory schools, Grade X     | 145 ± 1.4 |  112
High schools, Grade X            | 111       | 1194
Colleges, sophomore year         | 137       | 1615
Preparatory schools, Grade IX    | 126 ± 1.3 |   99
High schools, Grade IX           |  97       | 1593
Colleges, freshman year          | 129       | 3310

At the successive grade levels, the preparatory school medians are 
respectively 29, 34, 36 and 36 points higher than the high school 
medians. These differences are in each case more than 10 times the 
PE of the difference, and represent from 2 to 3 PE of the distribution 
of preparatory school scores. 
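
The differences quoted above follow directly from the medians of Table IV. The sketch below recomputes them and also shows how a difference is compared with its probable error. The high school probable error is not reported in this section, so the value used for it is a purely hypothetical placeholder, and the variable names are illustrative.

```python
import math

# Medians from Table IV, ordered Grade IX, X, XI, XII.
grades = ["IX", "X", "XI", "XII"]
prep_medians = [126, 145, 153, 156]
high_school_medians = [97, 111, 117, 120]

differences = [p - h for p, h in zip(prep_medians, high_school_medians)]
for grade, diff in zip(grades, differences):
    print(f"Grade {grade}: preparatory median exceeds high school median by {diff}")

# PE of a difference between two independent medians: sqrt(PE1^2 + PE2^2).
pe_prep = 1.3          # reported for the Grade IX preparatory median
pe_high_school = 1.0   # HYPOTHETICAL: not given in the text
pe_difference = math.hypot(pe_prep, pe_high_school)
ratio = differences[0] / pe_difference
print(round(pe_difference, 2), round(ratio, 1))
```

With any high school probable error of this general size, the smallest difference (Grade IX) remains well beyond ten times the probable error of the difference.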
TABLE V.—PER CENT IN ALPHA LETTER GRADES

Group                            |    A    |    B    |   C+   |   C
                                 | 135-212 | 105-134 | 75-104 | 45-74
Preparatory schools, Grade XII   |   83    |   16    |    1   |
High schools, Grade XII          |   29    |   44    |   23   |   4
Colleges, senior year            |   60    |   28    |   11   |   1
Preparatory schools, Grade XI    |   83    |   15    |    2   |
High schools, Grade XI           |   25    |   45    |   26   |   4
Colleges, junior year            |   58    |   31    |   10   |   1
Preparatory schools, Grade X     |   70    |   28    |    2   |
High schools, Grade X            |   17    |   43    |   34   |   6
Colleges, sophomore year         |   52    |   32    |   13   |   3
Preparatory schools, Grade IX    |   34    |   53    |   13   |
High schools, Grade IX           |    8    |   31    |   46   |  14
Colleges, freshman year          |   45    |   35    |   17   |   3

In Grade IX, 90.0 per cent of preparatory school students equal or exceed the 
high school median for the corresponding grade 

In Grade X, 97.7 per cent of preparatory school students equal or exceed the 
high school median for the corresponding grade 

In Grade XI, 92.4 per cent of preparatory school students equal or exceed the 
high school median for the corresponding grade 

In Grade XII, 97.8 per cent of preparatory school students equal or exceed the 
high school median for the corresponding grade 

91.6 per cent of preparatory school seniors equal or exceed the average of college 
freshmen 

82.0 per cent of preparatory school seniors equal or exceed the average of college 
sophomores 

78.7 per cent of preparatory school seniors equal or exceed the average of college 
juniors 

74.7 per cent of preparatory school seniors equal or exceed the average of college 
seniors 


What factors enter into the determination of these differences? 

1. The differences may be due in part to inequivalence of the
several forms of the Army Alpha. This is a factor which is commonly
ignored, although it has been shown that different forms may vary as
much as 6 points in difficulty in the middle of the scale, and as much as
10 points in the upper quartile. The most difficult of
the five forms, however, (Form 6) was used in the preparatory school
surveys and hence (since various forms have been employed in the




380 The Journal of Educational Psychology 


high school standardization) the true differences tend to be slightly 
greater rather than less than those that have been shown. 

2. The preparatory school superiority may reflect a sectional
difference in intelligence. The two preparatory schools are in the
east; the majority of the cases in the high school standardization are
derived from institutions in the middle west. Several considerations,
however, point to this factor as being of negligible importance.

(a) While the army draft results showed a superiority of one eastern
camp, the chief sectional distinctions were between the north and
the south, with little difference, in general, between the northeastern
and middle western draft.

(b) Results from such eastern institutions as Syracuse, Brown and
Rutgers show little deviation as compared with western state
universities (4, 5, 7, pp. 869-872).


(c) Reports from eastern high schools do not reveal a sectional 
advantage. ® 5 9 

3. The preparatory school superiority may be determined in part
by racial differences in intelligence. Nearly all members of the group
are from native born parentage, while the high schools include an
unknown proportion of pupils of foreign born and negro parents.
In the middle western schools it is probable that this factor is of minor
significance, particularly in view of the large proportion of superior
groups (German and Scandinavian) in the western foreign born
population. Recent evidence is available from the testing of 1442
pupils of American born parentage, in the Hartford, Connecticut,
Public High School. The median Alpha scores were 94 at Grade IX,
126 at Grade XI, and 136 at Grade XII. These scores are higher, in
Grades XI and XII, than the high school averages reported in Table
IV, probably due largely to the fact that the test was somewhat
modified for high school use. As they stand, the medians for
this group of New England parentage remain from 20 to 32 points
inferior to the preparatory school medians.

4. The differences may be due in part to a higher relative
chronological age in preparatory school pupils. From the study of Madsen
and Sylvester¹⁰ age statistics are available for three of the high schools
whose Alpha scores have been reported (Sioux City, Iowa; Rockford,
Illinois; and Madison, Wisconsin). I have computed an unweighted
average of the chronological ages at each grade level in these three
schools, and also an unweighted average for Riverdale and Hotchkiss.

TABLE VI.—MEAN CHRONOLOGICAL AGES AT THE TIME OF EXAMINATION

Grade       | Preparatory schools | High schools
Grade XII   |        17.0         |     17.1
Grade XI    |        16.4         |     16.7
Grade X     |        15.8         |     16.1
Grade IX    |        14.8         |     14.8

The preparatory school students tend to average slightly younger,
but the differences are not great. The differences in “mental age,”
as computed from conversion tables, are of course marked.


TABLE VII.—MEDIAN MENTAL AGES

Grade       | Preparatory schools | High schools
Grade XII   |        19.4         |     17.0
Grade XI    |        19.2         |     16.8
Grade X     |        18.7         |     16.4
Grade IX    |        17.4         |     15.5

It is apparent from these data that the average preparatory school 
pupil could be graduated much earlier than is now the case, if a policy 
of rapid advancement were regarded as expedient. 

5. It may be contended that the preparatory school superiority
is largely due to a sex difference in Alpha performance; the high school
and college norms are derived for the most part from co-educational
institutions, and it has been shown that girls tend to score lower than
boys. The actual sex difference is, however, slight. Madsen and
Sylvester¹⁰ report an average difference of about 6 points in high
schools; Wentworth finds a superiority of 8.7 points, in comparing
high school senior boys with girls. Since the sex difference must be
reduced one-half in making a correction, it is obvious that this factor
contributes very little to the margin of advantage of the preparatory
school group.
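
The halving of the sex difference can be made explicit. If a co-educational norm is the mean of equal numbers of boys and girls, a group of boys exceeds that norm by only half the difference between the sexes. The sketch below is illustrative; the equal split of boys and girls is an assumption, and the function name is hypothetical.

```python
def boys_advantage_over_mixed_norm(sex_difference, share_of_boys=0.5):
    """Points by which boys exceed a mixed-sex norm.

    mixed mean = share * boys + (1 - share) * girls, hence
    boys - mixed mean = (1 - share) * (boys - girls).
    """
    return (1 - share_of_boys) * sex_difference

# Sex differences quoted in the text; equal numbers of boys and girls
# in the norm group is an ASSUMPTION of this sketch.
corrections = {
    "Madsen and Sylvester": boys_advantage_over_mixed_norm(6.0),
    "Wentworth": boys_advantage_over_mixed_norm(8.7),
}
for source, points in corrections.items():
    print(source, points)
```

On this reckoning the correction amounts to some 3 to 4 points, against preparatory school margins of 29 to 36 points.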

6. J. E. Anderson interprets his findings at Hotchkiss with the
statement: “If the same principle holds in preparatory schools as in
the college, we may conclude that the high scholastic requirements
in conjunction with the large number of applicants for a limited number
of places, constitute the primary factors in determining the exceptional
selection of intelligence, as measured by tests, in the Hotchkiss student
group.”² These factors have not been demonstrated to be of major
importance at Hotchkiss. At Riverdale they are known to be negligible.
During the past 6 years individual tests have been given to
all applicants for admission to Riverdale. The median IQ (Stanford
Revision) has varied each year not more than 2 points from 117.
A study of applications shows that less than 5 per cent have been
rejected on account of a poor IQ, or because of lack of scholastic
preparation; the average intelligence of all applicants is not
substantially lower than that of the members of the school.

7. There remains the possibility that the chief selective agencies 
are economic and social. Such a finding would be in accord with the
present large body of evidence as to the relationship of intelligence 
scores and social status. In the universities which recruit a majority 
of their students from preparatory school groups, a higher median 
Alpha has been found than in state universities. The following 
comparative table is significant. 


TABLE VIII.—PER CENT IN ALPHA LETTER GRADES

Group                            |  A |  B | C+ | C  | Cases
Yale freshmen                    | 85 | 14 |  1 |    |  400
Preparatory schools, Grade XII   | 83 | 16 |  1 | -- |  408
College freshmen norms           | 45 | 35 | 17 |  3 | 3310
High schools, Grade XII          | 29 | 44 | 23 |  4 | 1215

Sixty per cent of Yale freshmen are graduates of preparatory
schools; their distribution is practically identical with that which we
have found for preparatory school students in Grade XII, and we may
infer that for this group, entrance to college does not involve an
additional selection. The Yale freshmen from high schools, on the other
hand, are obviously selected from the upper end of the high school
distribution. However distasteful the fact may be to equalitarians,
it appears that the socially prominent schools and universities derive
their intelligence ranking from the fact that their students are supplied
chiefly from the upper social classes. The selection is exerted through
(a) tuition and living costs, and (b) the selective appeal of an institution
of high social prestige. At present we can only conjecture to what
extent the tested differences are due to inheritance, and to what
extent they are traceable to physical and cultural advantages,
particularly during the pre-school period.


REJOINDER ON BURT’S REGRESSION EQUATION
KARL J. HOLZINGER AND FRANK N. FREEMAN 
University of Chicago 


If the issue between Professor Thomson and ourselves were merely
one of statistical technique, we might well let the matter rest with his
article in the May issue of this journal. For on the purely statistical
issues we are in agreement with him. On the interpretation of Burt’s
equation, however, we still differ with him, and we are venturing to
prolong the discussion one step farther in order to attempt briefly
to make clear the point at issue.


POINTS OF AGREEMENT


We agree on the theoretical interpretation of the regression equa- 
tion. We agree that it is permissible to make guesses as to probable 
causal relationships between factors which are found to be correlated, 
provided it is clearly stated that the guesses are guesses, and provided 
further that the relative probability of the various possible guesses 
is weighed impartially. 


POINTS OF DISAGREEMENT


The main points of disagreement relate to two matters. These
are, first, the interpretation of the relationship between Binet score and
schooling; and second, the interpretation of the meaning of the score
in the Burt Reasoning Test.

We still hold that the hypothesis made by Burt and by Thomson
concerning the causal relationships between the factors of Burt’s
equation does not take due account of the various possibilities.
We maintain that it is at least equally reasonable to suppose that the
ability which determines the child’s Binet score is also responsible for
the quality of his school work as to suppose that the quality of his
school work is responsible for his Binet score. When Thomson says,
“The most probable hypothesis here is simply that school work assists
the child to answer Binet questions,” we can only interpret the
statement by assuming that he still labors under the confusion between
the two meanings of school work. He is here, apparently, using
“school work” to designate the amount of school work the child has
undergone, not the quality or grade of his school work.
We believe the probable relationship can be expressed by applying
Thomson’s words to the second equation:


Quality of School Work = .69 Binet − .05 Burt + .47 Age


Thus, Binet mental age is more potent than any other of these factors
in enabling us to estimate quality of school work. We venture the
working hypothesis that it is the most potent causal factor in school
work. We see nothing in this hypothesis which requires “heroic
pedantry” for its acceptance. We are interested in the issue, not
primarily from the point of view of formal correctness, but because we
believe the interpretation which has been made of Burt’s equation
is not in fact the most probable one. The proponents of the view that
mental tests are unaffected by the amount and character of schooling
have rightly been accused of a biased interpretation of the relationship
between schooling in this sense and scores on intelligence tests. The
interpretation which has been made of Burt’s equation shows equal
bias on the other side. It is important that both sides of the
controversy interpret the scientific findings with impartial judgment.

To sum up, on the first point, Burt’s data give no evidence concerning
the effect of the amount and character of schooling upon the Binet
score. The relationship which it establishes is between quality or
grade of school work and Binet score. It is quite reasonable to suppose
that the amount of schooling one has had is an important factor in his
Binet score. It is equally reasonable to suppose that the ability which
is measured by the Binet Scale is an important factor in determining
the quality or grade of school work.

On the significance of the score in the Burt Reasoning Test Thomson
writes, “But an obvious working hypothesis ready to hand is that
in Burt’s Reasoning Score native intelligence alone is sufficient . . .”
Let us recall the equations:


Quality of School Work = .69 Binet − .05 Burt + .47 Age
and
Burt Score = .95 Binet − .10 Quality of School Work + .07 Age


It is apparent from these results that the Burt score contributes
little to the prediction of the quality of school work and, similarly,
quality of school work contributes little to the prediction of the Burt
score when the remaining variables are held constant. The probable
error of the coefficients, −.05 and −.10, is of the order
.6745/√300 = .04, so that they are probably insignificant.
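
The order of magnitude quoted for the probable error is easily verified. The sketch below evaluates .6745/√300 and expresses each of the two small coefficients as a multiple of that figure; the function name is illustrative, and the formula is the rough approximation used in the text rather than an exact standard error for a regression coefficient.

```python
import math

def probable_error_order(n):
    """Rough probable error used in the text: .6745 / sqrt(n)."""
    return 0.6745 / math.sqrt(n)

pe = probable_error_order(300)
print(round(pe, 3))  # about .039, which the text rounds to .04

# Each coefficient expressed as a multiple of the probable error.
ratios = {c: abs(c) / pe for c in (-0.05, -0.10)}
for coefficient, ratio in ratios.items():
    print(coefficient, round(ratio, 2))
```

Neither coefficient reaches three times the probable error, which agrees with the judgment that they are probably insignificant.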
Thomson’s interpretation would appear to be that Binet score is
affected by school work while the Burt score is a measure of
intelligence independent of schooling. Therefore the Burt score measures
native intelligence while the Binet score measures schooling. We have
already commented on the view that the Binet score measures
schooling. With reference to the interpretation of the Burt score we have
the following observations. First, there seems to be no connection
between the negative fact that the Burt score is not a measure of
schooling and the positive hypothesis that it measures native
intelligence. At any rate, if the Burt score measures intelligence it is,
according to Thomson’s reasoning, a kind of intelligence which is of no
practical importance. Intelligence which is unrelated to the quality
of school work is not the sort of capacity which we usually have in
mind when we use the term. If the capacity which is measured in
the Burt score is intelligence, then Thomson’s interpretation would
seem to imply not only that degree of attainment in school does not
affect intelligence but also that intelligence does not affect the degree
of attainment in school. Intelligence, in this view, must be some sort
of useless ornament, enabling us to work certain puzzles which the
psychologists have devised, but of no conceivable value in a practical
work.

An alternative method for showing the value of the Binet and Burt
tests for predicting quality of school work is to work out the multiple
correlations as follows:

(1) R (School Work)(Binet) = .91.

(2) R (School Work)(Burt) = .75.

(3) R (School Work)(Binet + Burt + Age) = .9330.

(4) R (School Work)(Binet + Age) = .9330.

These results show that the Binet test taken alone is a better predictor
of quality of school work than is Burt. It is also apparent that
dropping the Burt test as shown in equation (4) reduces the predictive
value of Binet and Age by an unappreciable amount. We may therefore
conclude that the Binet test should be used in preference to Burt’s
for predicting quality of school work, and that the latter test will
not improve the forecast when the Binet score is already known.
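
Why a second test may add almost nothing can be illustrated with the standard two-predictor formula for a multiple correlation. The sketch below uses the two zero-order correlations given in (1) and (2); the correlation between the Binet and Burt scores is not reported in this section, so the value used for it is hypothetical, and Age is left out for simplicity. It is an illustration of the principle, not a recomputation of (3) and (4).

```python
import math

def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple correlation of y with predictors 1 and 2 from zero-order r's."""
    r_squared = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return math.sqrt(r_squared)

r_school_binet = 0.91  # from (1) in the text
r_school_burt = 0.75   # from (2) in the text
r_binet_burt = 0.85    # HYPOTHETICAL: not reported in this section

r_both = multiple_r_two_predictors(r_school_binet, r_school_burt, r_binet_burt)
gain = r_both - r_school_binet
print(round(r_both, 4), round(gain, 4))
```

With a Binet-Burt correlation of this assumed size, adding the Burt score raises the multiple correlation by only about one point in the third decimal place.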
THE EFFECTS OF PRACTICE ON THE CORRELATIONS
OF THREE MENTAL TESTS


RALPH GUNDLACH 
University of Washington, Seattle, Wash. 


The experiment this paper describes was an attempt to answer the 
following questions about the intelligence testing of college students: 

1. With extensive practice does the dispersion of the group in- 
crease, decrease, or remain constant? The common lay-view seems to 
hold that with equal amounts of practice, different subjects will 
approximate each other’s performance. Some psychologists hold 
that the differences increase. 

2. To what degree are intercorrelations between the performance 
of the same subjects in various tests modified by extensive practice? 
That is, will the intercorrelations approach 1.00; will they approach 
0.00; or will they remain practically constant? Special abilities may
tend to be accentuated by further practice, lowering the intercorrela- 
tions; or initial differences may be smoothed down so that the subject 
reaches practically the same position in all tests. However, if low 
intercorrelations on the first trials are due to the varying amounts of 
practice the subjects have had, the intercorrelations of later trials will 
be higher. 

3. How accurately does a single short test, such as is found in every 
intelligence test, indicate the final position that would be attained at 
the practice limit of a given test? Does an initial snap-shot of an 
ability indicate the final capacity in that ability, or do single tests 
indicate only how much training the various subjects have had in the 
various functions? 

4. Do the correlations of such tests with a criterion of intelligence 
increase, decrease, or remain constant, with further practice? 

5. Incidentally, are there any sex differences? 

Among the researches that have been done on retesting subjects
there are several which may be considered here with profit. Thorndike
practiced 28 subjects 95 times in multiplying mentally three-place
numbers by three-place numbers. He gives the times for his
subjects in the total of the first five and the last five trials.² He did
not correlate the various performances, however. From these data a
correlation of .71 was computed. How that would compare with the
other reliability coefficients of the test it is impossible to say; but the
initial performance did indicate fairly well the ultimate capacity.

¹ See Starch: Educational Psychology, p. 88 for a typical example.
² Educational Psychology, Vol. 2, p. 147.

Wells¹ tested five men and five women 6 days a week for 30 days on
mental addition and cancellation of digits. Although again, he did not 
work any correlations, he does give the practice curves. They show 
practically the same results as were obtained in the experiment herein 
reported. The spread of the cases with practice seemed to increase in 
ratio to the mean, except in the case of women in cancellation. The 
early positions of the subjects were maintained fairly well throughout 
the practice period. The intercorrelations between the tests would 
be positive, but not perfect. 

Hollingworth reports, from the retesting of adult subjects, results
which disagree markedly from these.² He holds that even the first
half-dozen or so trials have little or no diagnostic value in predicting
final capacity in any function, and that low intercorrelations
between initial trials of various tests are not true of the real ability, as
shown at the end of practice. “With practice, then, the average
correlations of all tests become positive, and the coefficients become
greater the longer the practice is continued.”³ To explain this he is
driven to posit some “general factor” of intelligence.

It will be advisable to consider his experiment in some detail. He 
tested 13 subjects 205 times on the following tests: 


1. Adding.—Adding 17 mentally to each of 50 two-place numbers and reciting 
aloud the correct answer. Order of numbers random at each trial. Record with 
stop watch, time required for perfect score. 

2. Naming Opposites.—Correctly naming opposites of 50 adjectives which
occurred each time in random order. Record, time required for perfect score. 

3. Color Naming.—The Columbia laboratory form of the test, with 10 repeti- 
tions of each of the 12 colors. Position of card changed at each trial. Record, 
time required for perfect score. 

4. Discrimination Reaction.—Discriminating between red and blue, and
reacting correctly with appropriate hand. Record, average time, in sigma, and 
number of false reactions. 

5. Cancellation.—Crossing out digits from the Woodworth-Wells form of this 
test. Record, time required for 75 correct cancellations of equally difficult digits. 

6. Coordination.—The familiar three-hole test, for accuracy of aim. Record, 
time required for 100 correct strokes. 

7. Tapping.—Executing 400 taps at maximal speed with hand stilus, right 
hand, elbow support. Record, time required. 





¹ Relation of Practice to Individual Differences. American Journal of
Psychology, 1912, p. 80.

² Vocational Psychology, Chap. XI, pp. 245-265.

³ P. 251.
The subjects took these tests 205 times, though no improvement
was found in the last 100 trials. The average correlation of each test
with all others showed a marked and steady increase with practice
(except in the case of discrimination and coordination), increasing in
general from less than .20 to more than .50. This increase Professor
Hollingworth ascribes in part to (1) variability of individual
performance, and (2) the change in the character of the tests, through
practice with them; but most important of all, to (3) some further factor
such as “general ability,” or “general intelligence.” And “if there
is such a thing as general ability or ‘general intelligence’ we should
expect all samplings of that ability to correlate more and more as the
measures came to be truer samples. We might indeed expect to find
evidences of this general ability only when measuring the ‘ultimate
capacity’ of the individuals concerned. The momentary ability
revealed in initial trials, or even in the first half-dozen trials, in a given
set of tests might well be expected to show low degrees of correlation.
These trials would not be measures of ultimate capacity, but would
largely be determined by previous practice, chance variability,
momentary attitude and initial method of attack. They would, in short, be
samplings only of momentary ability, not of final capacity.”¹

Professor Hollingworth’s evidence is the table showing the correlation
of each test with all the others at a given level, and his table of
correlations of “ultimate capacity” with capacity at different points in the
curve of learning, which is given below.


TABLE SHOWING THE CORRELATION OF ULTIMATE CAPACITY WITH CAPACITY
AT DIFFERENT POINTS IN THE CURVE OF LEARNING²

The Test        | Preliminary | 5th   | 25th  | 50th  | 80th  | 130th | Final
                | trial       | trial | trial | trial | trial | trial | trial
Adding          |             |  .19  |  .87  |  .87  |  .97  |  .96  |  1.00
Opposites       |     .08     |  .62  |  .49  |  .83  |  .94  |  .98  |  1.00
Color naming    |     .68     |  .89  |  .86  |  .91  |  .97  |  .97  |  1.00
Discrimination  |     .67     |  .62  |  .60  |  .50  |  .50  |  .79  |  1.00
Cancellation    |     .67     |  .68  |  .88  |  .69  |  .93  |       | (1.00)
Coordination    |     .52     |  .79  |  .77  |  .90  |  .95  |       | (1.00)
Tapping         |     .23     |  .48  |  .63  |  .68  |  .69  |  .89  |  1.00
Averages        |     .41     |  .61  |  .73  |  .77  |  .85  |  .92  |  1.00

¹ Op. cit., pp. 254-255.
² From Hollingworth, Vocational Psychology, p. 259.
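
The Averages row of Hollingworth's table appears to be the simple mean of the entries in each column. The sketch below checks this for the 5th and 25th trials, reading the first legible entry of the Adding row (.19) as its 5th-trial value; that reading is an assumption forced by the condition of the printed table.

```python
# Column entries in the order: Adding, Opposites, Color naming,
# Discrimination, Cancellation, Coordination, Tapping.
# ASSUMPTION: the Adding row's first legible value (.19) belongs to the
# 5th trial and its second (.87) to the 25th trial.
fifth_trial = [0.19, 0.62, 0.89, 0.62, 0.68, 0.79, 0.48]
twenty_fifth_trial = [0.87, 0.49, 0.86, 0.60, 0.88, 0.77, 0.63]

def column_average(values):
    """Unweighted mean of the correlations in one column."""
    return sum(values) / len(values)

avg_5 = round(column_average(fifth_trial), 2)
avg_25 = round(column_average(twenty_fifth_trial), 2)
print(avg_5, avg_25)  # printed averages in the table: .61 and .73
```

Both computed means agree with the printed averages, which supports this reading of the Adding row.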
What Professor Hollingworth did was to take several tests of
varying complexity, and practice his subjects in these until the
responses had become practically automatic. This simple response is
what Professor Hollingworth called “final ability” in the complex tests
which his subjects performed on the first trial. The pooled “final
ability” to perform several such simplified responses he called “general
ability.” The increase in intercorrelations which he found as practice
continued he ascribed to the fact that with practice his tests
more accurately tapped “general ability.” But the time required to
make simple automatic responses is not usually considered a test of
intelligence.

I do not believe that Professor Hollingworth’s conclusions are
justified by the facts he presents:

1. Since he used the same test material at every trial, several of his
tests, as he himself indicated, would tend to measure a quite different
function after a number of repetitions. Adding and opposites especially,
and perhaps color-naming, would, with continued repetition,
change in their nature. In these tests the same 50 opposites were
used, the same 17 were added to the same 50 two-place numbers, the
same names were attached to the same colors. After a number of
trials these tests would be like responding with a person’s last name
when the stimulus of his first name was presented. It would be
interesting and instructive to see how the “final” position as
determined by 175 trials in “opposites” and “adding” would correlate
with other forms of the same test. I do not believe the so-called
“opposites” and “adding” tests deserve the name at the end of the
practice series; nor can the scores determined at that time be called
measures of final capacity in either naming opposites, or in adding.
It should be noted that Adding and Opposites are the tests which are
lowest in the preliminary trial.

2. If this criticism is justified by the facts, it would suffice to 
account for the increase in intercorrelations as practice continued. 
It would then be unnecessary to posit some “general ability” to 
explain such an increase. 

3. Professor Hollingworth does not mention giving these subjects
any intelligence ratings or tests. His only criterion of intelligence
is the composite of the seven “speed” tests. Nor does his composite
correlate closely with the various tests. The final average is but .49,
with the smallest at .34 and the largest at .62. Such an average
correlation, after 205 practices, the last 100 of which saw no change in
the position of the subjects, is no warrant for a factor of general
intelligence.

4. Thirteen subjects are rather few from which to make generaliza-
tions. Personal idiosyncrasies in the group may have been sufficient 
to make the results very unusual. 

Because of the difference in method of testing between Professor
Hollingworth’s experiment and the one herein described, the data are not
contradictory, even though the conclusions are.

The experiment this paper reports consisted in testing 39 college 
students 25 times on different forms of three tests. In addition they 
twice took a college intelligence test which is given by the university 
to entering freshmen. The subjects, who were volunteers from the 
introductory psychology class, were tested four days a week from 
12 to 12:30. 

The tests¹ used were the following:


1. Hard Number-series, Completion.—There were 3 forms devised
with 25 number-series of increasing difficulty to be completed; and 7
forms with 35 problems. The test took six minutes. Score, number
right.

2. Cancellation Test.—A column of eight-lettered words was paired
with a column of nonsense material seven letters in length. These
letters included all but one of the letters in the word to the left. The
subject was to cross out in the word the letter not appearing in the
group of letters to the right. There were ten forms with 150 words
each, for this test. Time, six minutes. Score, number completed.
Errors were negligible.

3. Multiplication.—Three figures were multiplied by two. Neither
0 nor 1 was used. There were 3 forms of 24 problems, and 7 forms of
35 problems for this test. Time 5 minutes. Score, number right.

4. The college intelligence test used was one given to entering
freshmen the previous year. It correlates with Alpha about .75. Its
self-correlation is about .93. It is of the battery type, containing 7 tests.


¹ Note: Samples of the three tests used.

1. Number Series Completion.

1. 37, 38, 39, 40, ( )          26. 52, 61, 43, 52, 25 ( )
2. 19, 22, 25, 27, 30 ( )       27. 48, 63, 93, 148, 208 ( )
3. 8, 17, 27, 38 ( )            28. 11, 18, 4, 7, 14, 0 ( ) etc.

2. Cancellation Test.
Directions: cross out in each word the letter which is omitted from the group of
letters at the right.

1. Esthetic Echseti    Tortured Retuotd    February Reabufy
2. Harmonic Maichrn    Displays Sldyias    December Mcbeder
3. American Miranac    Endeavor Nvrdeao    Borrowed Rwdrboe
etc.

3. Multiplication.

 945    968    236    259    247
  62     39     97     76     86
 ---    ---    ---    ---    ---    etc.

There was some question as to whether the tests should cover a 
given length of time, or whether the amount of work done should be 
held constant. When the time is held constant the better subjects 
get a great deal more practice than do the poorer subjects. But if the 
tests were given as group tests, work held constant, the rapid workers 
would become bored waiting for the others to finish. Only the slow 
ones would be kept busy. As there was not time to test the subjects 
individually during their noon hour it was finally decided that for con- 
venience in giving the tests and for the maintenance of live interest on 
the part of the subjects, it would be better to keep the time constant. 

Table I gives the means and sigmas for the three tests. The 
scores are in terms of number right, in the case of number-series and 
multiplication, and in terms of number attempted, in the cancellation 
test. The average for each trial separately, and for the totals of each 
successive set of five trials are given. It is evident that the subjects 
have not reached the limit of their ability in any of the three tests. 
They are, however, beginning to slow up in all the tests, and especially, 
multiplication. It was impossible to practice the group further since
the quarter ended. But the results from even 25 trials are very significant.
In Hollingworth’s tests the first 25 trials showed the greatest
change of any set of similar length. The correlations of the first,
fifth, or twenty-fifth trial are as high as the correlation for the fiftieth
trial with “ultimate capacity,” with the exception of Opposites and
Coordination (see Hollingworth’s table quoted). Consequently, any
changes in the direction indicated by Hollingworth’s results should be
manifest by the 25th trial. Such, however, is not the case.


crease is much larger, in the case of the faster subjects, the dispersion 
remains practically the same in relation to the mean. Practice neither 
increases nor decreases the relative lead of the faster subjects. 
TABLE I. MEANS AND SIGMAS FOR TRIALS 1 TO 25; AND FOR TOTALS OF GROUPS OF FIVE

          Number series (A)        Cancellation (B)         Multiplication (C)
Trial     Mean    Sigma   Form     Mean     Sigma   Form    Mean    Sigma   Form
  1       11.30    3.95     1      47.16    11.32     1     11.02    4.31     1
  2       10.76    4.57     2      58.80    11.64     2     14.62    4.98     2
  3       12.52    4.44     4      61.50    11.97     3     13.26    5.35     5
  4       12.81    3.75     5      65.10    13.65     4     15.93    4.81     6
  5       14.00    4.42     6      60.35    17.07     5     13.50    5.43     7
  6       13.80    5.32     7      59.30    13.38     7     14.87    5.53     8
  7       14.78    4.29     8      53.43    11.82     6     15.55    5.05     9
  8       14.88    5.16     9      56.73    12.90     8     15.77    5.21    10
  9       12.60    2.76    10      68.79    12.57     9     16.06    5.02    11
 10       16.63    5.30     4      64.42    16.50     1     15.26    5.60     5
 11       16.67    4.91     5      70.05    16.08     2     16.88    5.87     6
 12       16.62    5.34     6      76.10    16.59     3     15.78    5.44     7
 13       16.85    6.32     7      79.52    16.60     4     16.68    6.34     8
 14       16.73    4.53     8      73.28    17.24     6     16.98    5.99     9
 15       17.39    4.89     9      62.28    14.00     7     17.03    6.36    10
 16       14.03    3.06    10      72.58    19.76     5     17.24    6.21    11
 17       18.42    4.88     4      70.61    15.20     8     18.11    5.74     4
 18       18.42    5.15     5      65.56    15.20     6     15.80    5.84     5
 19       18.52    6.10     7      78.53    18.89     7     18.90    5.74     7
 20       17.91    6.70     8      74.89    17.05     5     17.29    5.77     9
 21       15.01    4.43    10      71.92    17.91     8     17.47    6.99     8
 22       18.67    6.11     9      74.28    17.47     6     17.11    6.19    11
 23       18.98    4.71     8      78.10    20.85     5     19.44    5.60     4
 24       20.50    6.06     6      85.00    20.29     3     18.44    6.67    10
 25       20.11    7.52     7      79.33    18.19     2     17.29    5.57     9

Totals
 1-5      59.50   19.45    ..     291.70    62.21    ..     66.75   23.15
 6-10     71.50   20.90    ..     302.50    61.90    ..     75.30   25.10
11-15     82.48   24.96    ..     353.30    75.80    ..     81.60   29.10
16-20     86.00   23.85    ..     361.93    82.00    ..     85.06   27.40
21-25     91.45   26.96    ..     389.70    92.10    ..     88.00   28.50

The number after the sigma indicates the number of the form of the test used. 
The first three forms of the number series and multiplication tests had only 25 
problems; the remainder had 35. 
TABLE II. COEFFICIENTS OF VARIABILITY FOR THE SEPARATE TRIALS 1 TO 25;
AND FOR TOTALS OF GROUPS OF FIVE

           Number series (A)    Cancellation (B)     Multiplication (C)
Trial      Coeff.    Form       Coeff.    Form       Coeff.        Form
  1        34.95       1        24.00       1        39.21           1
  2        42.47       2        19.79       2        34.06           2
  3        35.47       4        19.46       3        40.34           5
  4        29.27       5        20.12       4        30.19           6
  5        31.57       6        28.28       5        40.32           7
  6        38.55       7        22.56       7        37.28           8
  7        29.02       8        22.12       6        32.37           9
  8        34.67       9        22.73       8        33.13          10
  9        21.90      10        18.27       9        31.25          11
 10        31.87       4        25.61       1        36.72           5
 11        29.45       5        22.95       2        34.77           6
 12        32.12       6        21.80       3        34.47           7
 13        37.50       7        20.87       4        38.00           8
 14        27.17       8        23.52       6        35.27           9
 15        28.11       9        22.47       7        37.34          10
 16        21.81      10        27.22       5        36.02          11
 17        26.49       4        21.52       8        31.69           4
 18        27.95       5        23.18       6        36.96           5
 19        32.93       7        24.05       7        30.37           7
 20        37.40       8        22.76       5        [illegible]     9
 21        29.51      10        24.90       8        40.01           8
 22        32.73       9        23.51       6        36.27          11
 23        24.81       8        26.69       5        28.80           4
 24        29.56       6        23.87       3        36.27          10
 25        37.39       7        22.92       2        32.21           9
Average    31.38                23.00                35.07

Totals
 1-5       32.68                21.32                34.68
 6-10      29.23                20.48                33.33
11-15      30.26                21.45                35.66
16-20      27.73                22.65                32.21
21-25      29.48                23.63                32.38
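
The entries of Table II appear to be the sigma of each distribution expressed as a percentage of its mean. The article does not print this definition, so the sketch below assumes it; the function name is hypothetical and the inputs are the trial 1 means and sigmas from Table I. Under that assumption the cancellation value is reproduced exactly and the number-series value to within rounding, while multiplication comes out near 39.1 against the printed 39.21, so small transcription or rounding differences should be expected.

```python
def coefficient_of_variability(mean: float, sigma: float) -> float:
    """Return sigma expressed as a percentage of the mean (assumed definition)."""
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return 100.0 * sigma / mean


# Trial 1 means and sigmas as printed in Table I: (mean, sigma).
trial_1 = {
    "number series": (11.30, 3.95),
    "cancellation": (47.16, 11.32),
    "multiplication": (11.02, 4.31),
}

# Coefficient of variability per test, rounded to two decimals like Table II.
cv_trial_1 = {
    test: round(coefficient_of_variability(mean, sigma), 2)
    for test, (mean, sigma) in trial_1.items()
}

for test, cv in cv_trial_1.items():
    print(f"{test}: {cv:.2f}")
```

Because the coefficient is a ratio, a sigma that grows in step with the mean leaves it unchanged, which is the sense in which the dispersion "remains practically the same in relation to the mean."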
TABLE III. RELIABILITY COEFFICIENTS, INTERCORRELATIONS, AND CORRECTED
CORRELATIONS BETWEEN NUMBER SERIES A, CANCELLATIONS B AND MULTIPLICATIONS C,
TRIALS SEPARATELY AND TOTALED IN GROUPS OF FIVE

Reliability coefficients           Raw intercorrelations      Corrected intercorrelations
Trials          A     B     C      Trial   A:B   B:C   A:C    A:B   B:C   A:C
 1-2           .77   .86   .80       1     .38   .37   .38    .47   .45   .49
 2-3           .78   .85   .81       2     .28   .38   .45    .34   .45   .56
 3-4           .75   .85   .84       3     .53   .42   .40    .66   .50   .51
 4-5           .78   .77   .84       4     .31   .56   .38    .40   .70   .47
 5-6           .87   .91   .84       5     .57   .56   .34    .64   .65   .40
 6-7           .80   .73   .76       6     .52   .50   .40    .61   .67   .54
 7-8           .76   .76   .81       7     .40   .48   .25    .52   .61   .31
 8-9           .74   .80   .78       8     .47   .44   .42    .61   .56   .55
 9-10          .67   .88   .81       9     .30   .55   .14    .39   .65   .19
10-11          .80   .82   .90      10     .44   .53   .57    .54   .62   .68
11-12          .86   .88   .90      11     .37   .63   .41    .48   .72   .47
12-13          .90   .75   .92      12     .57   .51   .52    .70   .62   .58
13-14          .90   .90   .89      13     .50   .51   .45    .56   .57   .50
14-15          .89   .86   .87      14     .70   .47   .53    .80   .58   .64
15-16          .74   .88   .90      15     .62   .40   .51    .71   .49   .63
16-17          .74   .93   .92      16     .51   .30   .20    .62   .32   .24
17-18          .82   .87   .81      17     .63   .42   .60    .74   .50   .73
18-19          .87   .91   .84      18     .67   .38   .38    .75   .43   .44
19-20          .81   .90   .88      19     .45   .58   .46    .52   .65   .54
20-21          .81   .93   .89      20     .63   .54   .50    .70   .59   .58
21-22          .79   .83   .88      21     .61   .52   .50    .75   .60   .59
22-23          .84   .85   .87      22     .66   .52   .36    .78   .60   .42
23-24          .80   .83   .87      23     .62   .41   .51    .76   .48   .61
24-25          .87   .82   .82      24     .61   .54   .51    .72   .64   .60
                                    25     .64   .39   .50    .74   .47   .59

Averages of the Above Correlations
 1-5           .79   .85   .83             .41   .45   .39    .50   .55   .48
 6-10          .75   .80   .81             .43   .50   .35    .53   .62   .45
11-15          .86   .85   .89             .55   .50   .48    .64   .59   .56
16-20          .81   .91   .87             .58   .48   .43    .67   .50   .51
21-25          .82   .83   .86             .63   .47   .47    .75   .56   .56

Correlations of Total Scores
 1-5 & 6-10    .92   .91   .96     1-5     .53   .56   .47    .61   .63   .51
 6-10 & 11-15  .95   .94   .94     6-10    .61   .53   .49    .64   .57   .52
11-15 & 16-20  .96   .96   .91    11-15    .62   .57   .55    .65   .61   .59
16-20 & 21-25  .94   .88   .95    16-20    .11   .59   .61    .67   .64   .64
                                  21-25    .71   .50   .61    .77   .54   .64
Table III gives the raw reliability coefficients, the raw intercorrelations
and the intercorrelations corrected for attenuation by the
formula

$$r_{\infty\omega} = \frac{r_{12}}{\sqrt{r_{11}\, r_{22}}}$$

where $r_{12}$ is the raw intercorrelation of two tests and $r_{11}$ and $r_{22}$ are their
reliability coefficients.
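
As a worked check on the correction for attenuation, the sketch below divides a raw intercorrelation by the square root of the product of the two reliability coefficients. This is the standard form of the correction and is assumed here to be the one the article used; the function name is hypothetical. The inputs are the number series and cancellation figures printed in Table III for trials 2 and 3, each paired with the reliabilities listed on the same row.

```python
from math import sqrt


def correct_for_attenuation(r12: float, r11: float, r22: float) -> float:
    """Raw intercorrelation r12 divided by sqrt(r11 * r22), the two reliabilities."""
    if not (0.0 < r11 <= 1.0 and 0.0 < r22 <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r12 / sqrt(r11 * r22)


# Trial 2: raw A:B = .28; reliabilities (trials 2-3) A = .78, B = .85.
corrected_trial_2 = correct_for_attenuation(0.28, 0.78, 0.85)

# Trial 3: raw A:B = .53; reliabilities (trials 3-4) A = .75, B = .85.
corrected_trial_3 = correct_for_attenuation(0.53, 0.75, 0.85)

print(round(corrected_trial_2, 2), round(corrected_trial_3, 2))
```

Both results round to the corrected A:B values printed for those trials (.34 and .66), which supports this reading of the formula.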
The correlations shown in Table III are, however, misleading, 
They are not strictly comparable for the reason that the dispersions 
are not the same. A reliability of .77 for the first two trials is not the 
same as a coefficient of .77 for the last two trials due to the difference 
in the size of the sigmas. The coefficients of alienation for the two 
correlations will be the same, but the standard error of estimate will 
be larger for the measure with the greater sigma. A formula for 
expressing the predictive value of one reliability coefficient in terms of 
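the sigma of another distribution of the same test is provided by Kelley.¹
The formula is

$$\frac{\sigma}{\Sigma} = \frac{\sqrt{1 - R_{11}}}{\sqrt{1 - r_{11}}}$$

where $\sigma$ and $r_{11}$ are the sigma and reliability in one distribution and
$\Sigma$ and $R_{11}$ those in the other.
The reliability of the tests for the first trial was A, .77; B, .86; C, .80.
The equivalents for this initial reliability converted into terms of the
distribution of trials 5, 10, 15, 20 and 25 are found in Table IV.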





TABLE IV. EQUIVALENTS OF TRIALS 1 IN TERMS OF SIGMAS OF OTHER DISTRIBUTIONS

          Number series   Cancellation   Multiplication
Trial           A               B               C
  1            .77             .86             .80
  5            .84             .95             .89
 10            .89             .94             .90
 15            .87             .92             .94
 20            .92             .99             .89
 25            .93             .94             .88
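
The conversion behind Table IV can be sketched as follows, assuming Kelley's relation is read as: the ratio of the two sigmas equals the ratio of the square roots of one minus each reliability. Solving for the equivalent reliability R in the distribution with sigma Σ gives R = 1 - (1 - r)(σ/Σ)². The function name is hypothetical, and the example uses the single-trial number-series sigmas from Table I, which may not be exactly the sigmas the author used.

```python
def equivalent_reliability(r: float, sigma: float, other_sigma: float) -> float:
    """Reliability equivalent to r (observed with `sigma`) in a distribution
    whose standard deviation is `other_sigma`, assuming
    sigma / other_sigma = sqrt(1 - R) / sqrt(1 - r)."""
    if sigma <= 0 or other_sigma <= 0:
        raise ValueError("sigmas must be positive")
    return 1.0 - (1.0 - r) * (sigma / other_sigma) ** 2


# Number series: trial 1 reliability .77 with sigma 3.95 (Table I).
r_first, sigma_first = 0.77, 3.95

# Re-expressed in terms of the sigmas of trials 20 and 25 (6.70 and 7.52).
equiv_trial_20 = equivalent_reliability(r_first, sigma_first, 6.70)
equiv_trial_25 = equivalent_reliability(r_first, sigma_first, 7.52)

print(round(equiv_trial_20, 2), round(equiv_trial_25, 3))
```

The trial 20 result rounds to the .92 printed in Table IV and the trial 25 result lies close to the printed .93. The entries for the earlier trials agree less closely under this reading, so the sketch illustrates the direction of the adjustment and is not an exact reconstruction.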











With this correction in mind, we may now turn to the consideration 
of Table III. The raw reliability coefficients increase slightly in the 
case of all three tests, for the separate trials: but the coefficients for the 
totals of the five trials do not show any marked increase. Such a 
sweeping generalization can not be made about the intercorrelations. 
Number series correlated with cancellation show a substantial rise both 





1 Statistical Method, pp. 221-223. 
in the individual trials and in the correlations for the total of five
trials. It starts out slightly lower than the correlations of cancellation
with multiplication, but passes them somewhere around the twelfth
trial. Cancellation correlated with multiplication remains roughly
constant throughout. The first correlation is .37, and the twenty-fifth
is .39. The correlation of number series with multiplication holds a
midway position between these other two; it increases slightly in the
separate trials, though the increase is more marked in the correlation
of the totals of five trials. From these 25 trials on number series,
cancellation, and multiplication, then, we may say that no generalization
is possible as to what intercorrelations on other tests will be after
practice; but that in the tests used in this experiment Number Series:
Cancellation increases significantly; Number Series: Multiplication
increases slightly; Cancellation: Multiplication shows no definite trend
either up or down. The correlations do not approach 1.00 nor do they
approach .00.

To determine how accurately the first rough snap-shot of ability 
predicts ultimate capacity in these tests the first trial was correlated 
with the fifteenth, twentieth, and twenty-fifth: the total of the first 
three trials with the total of the thirteenth to fifteenth trials; and the 
total of the first to fifth trials with the totals of the sixteenth to twen- 
tieth, and twenty-first to twenty-fifth trials. The results are shown 
in Table V. 


TABLE V. CORRELATION OF THE FIRST TRIALS WITH THE LAST TRIALS

                  Number series   Cancellation   Multiplication
Trial                   A               B               C
1 : 15                 .89             .74             .81
1 : 20                 .85             .64             .79
1 : 25                 .77             .64             .85
1-3 : 13-15            .87             .84             .90
1-5 : 16-20            .86             .86             .95
1-5 : 21-25            .88             .83             .93
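
Each entry of Table V is a correlation between the subjects' scores on an early trial, or block of trials, and their scores on a later one. The sketch below shows the computation as a product-moment correlation, assuming that is the coefficient the author computed. The article does not print the individual scores, so the two score lists are invented purely for illustration, and the function name is hypothetical.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 scores each")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary within each list")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for six subjects (NOT data from the study).
first_trial = [8, 11, 12, 15, 9, 14]
later_trial = [15, 19, 22, 27, 16, 21]

r_first_later = pearson_r(first_trial, later_trial)
print(round(r_first_later, 2))
```

A high value means the subjects keep roughly the same rank order after practice even though every score has risen, which is the sense in which the first trial "predicts" later standing.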














The correlations in Table V are surprisingly large. All of them are 
within range of the consecutive reliability coefficients with the excep- 
tion of the single trials for the cancellation test. The lowness of these 
correlations might be explained by the fact that after eight or nine 
trials a number of the class learned an improved method of performing 
the test. According to some of their reports, they would scan, first, 
the whole word, and second the entire group of accompanying letters,
perceiving at a prolonged glance the missing letter. The others continued
to search out each letter of the word in the group of letters at
the right. The learning curve for the cancellation test points to the
same thing. There is a sudden rise at about the ninth trial, and
another about the nineteenth or twentieth. Not all the subjects had
achieved this method when the experiment was completed. However,
despite the change in the function of the cancellation test, the initial
trial predicts with fair reliability the relative positions of the subjects
after considerable practice.

The comparison of the various trials with the criterion of intelli- 
gence will be found in Table VI. 


TABLE VI. CORRELATIONS OF EVERY FIFTH TRIAL AND TOTALS FOR EACH FIVE
SETS OF TRIALS WITH TOTAL OF TWO TRIALS ON COLLEGE INTELLIGENCE TEST

        Number   Cancella-  Multipli-             Number   Cancella-  Multipli-
Trial   series   tion       cation       Trials   series   tion       cation
          A        B          C                     A        B          C
  1      .44      .43        .12          1-5      .55      .46        .25
  5      .65      .44        .10          6-10     .63      .35        .22
 10      .37      .33        .21         11-15     .58      .47        .24
 15      .63      .51        .27         16-20     .71      .46        .22
 20      .63      .48        .18         21-25     .64      .49        .24
 25      .63      .50        [illegible]




















The two trials of the college test correlated .924. 


There is no increase in the correlation of these tests with intelligence 
except possibly in the case of number series. Number series, which 
correlates highest in every case, increases only from the first, or initial 
correlation. The last four, for individual trials, and for the totals, do 
not increase. The correlations with intelligence tests are: For number 
series, about .63, for cancellation, from .40 to .50, for multiplication, 
from .15 to .25. 

It remains to consider sex differences. Considering the few cases, 
little can be claimed for the results, except that they suggest what 
might be the case. There were but 15 women and 24 men. (Holling- 
worth had but 13 subjects in his experiment.) From a study of the 
comparative means (Table VII) it was found that the means for intelli- 
gence scores were the same, that the men excelled consistently in both 
number series, and multiplication; but in the cancellation test, despite 
TABLE VII. MEANS AND APPROXIMATE SIGMAS FOR MEN AND WOMEN ON THE

                Number series (A)   Cancellation (B)   Multiplication (C)
Trials          Mean    Sigma       Mean    Sigma      Mean         Sigma
1-5
  Men           65.3     20         287      66        68.5          21
  Women         51.5     16         298      52        62.           25
6-10
  Men           74.8                301                78.9
  Women         62.6                304                68.6
11-15
  Men           84.1     26         348      82        83.3          21
  Women         74.4     23         360      62        [illegible]   35
16-20
  Men           91.4                356                88.5
  Women         76.1                371                78.6
21-25
  Men           96.3     27         387     100        90.5          26
  Women         83.2     23         391      71        82.           29

College test    Mean    Sigma
  Men           213.     66.7
  Women         215.     49.5








TABLE VIII. SEXES SEPARATE AND COMBINED, FOR TOTALS OF TRIALS 1-5,
11-15, 21-25; INTERCORRELATED, AND CORRELATED WITH INTELLIGENCE RATINGS

Intercorrelations

          A:B Number series:     B:C Cancellation:      A:C Number series:
          cancellation           multiplication         multiplication
Trials    M     F     Both       M     F     Both       M     F     Both
 1-5     .54   .71    .53       .53   .60    .56       .48   .50    .47
11-15    .69   .64    .62       .65   .62    .57       .54   .53    .55
21-25    .95   .75    .71       .42   .63    .50       .52   .63    .61

Correlations with intelligence

Trials   Test    M     F     Both
 1-5      A     .47   .93    .55
          B     .51   .48    .46
          C     .09   .49    .25
11-15     A     .69   .81    .58
          B     .44   .51    .47
          C     .29   .30    .24
21-25     A     .62   .89    .64
          B     .51   .67    .49
          C     .17   .50    .24
the fact that one man led the group by at least 25 points every trial,
the average for the women is better. A study of the sigmas reveals odd
material. The dispersion for the men is greater in both number series
and cancellation, but not in the multiplication test. The comparative
increase in dispersion for cancellation is regular for both groups, but
that for the men increases much more rapidly. The multiplication
data would indicate that around the fifteenth trial the rapidly multiplying
women reached their peak, the increase in mean and decrease in
sigma being due to the continued progress of those slower people who
have not had as many problems on which to practice. The men do not
seem to have reached the highest point in their course until some five
or ten trials later than the women. Some correlations were worked out
for the separate sexes on the hypothesis that the correlation would be 
much higher, due to the inequality of the sexes on the different tests. 
For instance, in correlating, say number series with cancellation, sexes 
combined, the women would probably populate one side of the regres- 
sion line, and the men the other side. The same result should be 
manifest in correlating cancellation with multiplication but, since 
the men lead in both number series and multiplication, the intercorrela- 
tions of these tests should not be affected. And further, the correla- 
tion of all the three tests with intelligence ratings should be slightly 
higher for the two sexes. The correlations for the sexes, combined and 
separate, are in Table VIII. 

The figures bear out the anticipated results fairly completely. 
The intercorrelations of number series with cancellation for the sexes 
divided is higher in every instance than that for the total group, 
though the differences are not consistently significant. In the case of 
cancellation and multiplication, the correlations for the women are 
consistently higher than the coefficient for the total population; but 
the men correlate lower in these two functions, two out of three times. 
The correlations between number series and multiplication came out as 
expected, showing practically no differences. Here again, however, 
the correlations for the women are higher than those for the men. 

The correlations of the tests with intelligence show both men and 
women are much higher separately than combined in number series 
with intelligence. In the cancellation tests correlated with intelligence 
the sexes separate are higher than combined with but one exception. In
multiplication correlated with intelligence, the women are significantly 
higher than the combined group, while the men show practically no 
correlation between the two functions, being in two cases less than .20. 
An unexpected result was the high correlations for the women.
Out of the 18 correlations for the sexes, the women are higher than the 
correlations of the total group in all but 1, and higher than the corre- 
lations for the men in all but 5; and of these 5, 1 only is greater than .05. 
The correlations of the men are higher than those for the group in 10 
of the 18 possibilities. These differences are not to be explained 
merely in terms of the greater dispersion for the men, or for the total 
group; for in the case of multiplication the dispersion of the women is 
much greater than that of the men, and yet it is here that the men 
correlate lowest with intelligence. 

The conclusions of this experiment are: 

1. The dispersion of the group remained practically constant 
throughout the practice on 25 trials. 

2. Extensive practice on different forms of various tests tends to 
increase the correlations, number-series with cancellation, and to a 
less extent, number-series with multiplication; but not the correlation 
of cancellation and multiplication. There seems to be no constant 
tendency in all tests. 

3. The first trial of the tests predicts with a high degree of certainty 
the later positions in that test. 

4. Five totals for each test, five trials at a time, correlate practically 
without change with a criterion of intelligence. Number series may 
be a possible exception, increasing immediately from about .55 to 
about .65. Cancellation stays at about .47, while multiplication 
seems fixed at .24. 

5. Women do better than men in the cancellation test. The dispersion
of the women, however, is greater than that of the men only in multi- 
plying. Intercorrelations for the sexes separate are higher in number 
series with cancellation; cancellation with multiplication; both number 
series and cancellation with intelligence. The correlations for the 
women are consistently higher than those for the men. 

6. Professor Hollingworth’s conclusions that initial trials do not
indicate what final capacity will be; and that intercorrelations increase
with practice, due to a more accurate tapping of “ultimate capacity”
or “general ability,” are not borne out by Tables III, IV, V and VI.
A SURVEY OF THE FIELD OF CLINICAL
PSYCHOLOGY IN NEW YORK STATE¹


ELEANOR E. BOYAKIN 
Long Beach City Schools, Long Beach, California 


In recent years there has been a considerable amount of literature on 
the subject of the value of psychological tests in schools, courts, reform- 
atories, hospitals, etc. and one is thus led to believe that there is 
much work being done in this field. With much optimistic enthusiasm 
the writer set out to determine the extent of psychological testing in 
the state of New York, the location of the clinics, the type of individual 
served (whether children, adolescents or adults) and whether the 
examiners giving the tests were adequately and scientifically trained. 

The first requisite for such a survey was a list of clinics or psycho- 
logical departments to which a questionnaire could be sent, but it was 
not long before it was discovered that such a list had never been com- 
piled. Thus it became necessary to interview, or correspond with, 
representatives of practically all the city and state departments, 
educational bureaus, social agencies and other associations that would 
be likely to know of the existence of such clinics or departments. In 
this way a tentative list of about 275 different places was gathered. 
A copy of a questionnaire containing 22 questions was then sent to 
each. 

It is due to the hearty response and cooperation of these depart- 
ments and associations accompanied by numerous requests that a copy 
of the report be sent them when completed, that a brief resumé of the 
survey is now published. Owing to limited space in the Journal, only 
a very brief report of the results of the survey can be given.²





1 Grateful acknowledgment is due to Professors A. T. Poffenberger and Leta S. 
Hollingworth of Columbia University and Dr. Ethel Cornell of the State Depart- 
ment of Education, Albany, for helpful suggestions and encouragement. The 
writer wishes to thank all the busy executives, psychologists, physicians, teachers, 
judges and others who not only took time to fill in the questionnaire, but who added 
the names of other psychological departments. 

2 It is difficult to state from the returns just how many genuine psychological
clinics there are in New York state, but the following places have what resembles, 
in some respects, at least, such a clinic. Rather extensive examinations are given 
either by the psychologist, if one is on the staff, or by a psychiatrist who is familiar 
with the technique of giving mental tests in the following towns and cities: 

Albany, Buffalo, Canaan, Corning, Ellis Island, Elmira, Hornell, Hudson, 
Ithaca, Napanoch, North Tonawanda, Ossining, Rochester, Schenectady, Staten 
Island, Syracuse, Thiells, and White Plains. 
The New York State Commission for Mental Defectives conducts about
60 monthly clinics throughout New York state and one traveling clinic. 
These clinics are under the direction of a psychiatrist, but the psycho- 
logical testing is done by five field agents, or psychometric examiners 
who do the social work as well. The Commission handles only the 
small cities and towns, averaging at present only about 4000 cases per 
year, nearly all of which are school children. V.C. Branham, M.D. is 
head psychiatrist and Katharine G. Ecob is Chief Field Agent. 

The State Hospital Commission conducts a series of psychiatric 
clinics, but there are few psychological tests given, as cases suspected 
of being feebleminded or of needing a psychological test to determine 
vocational placement are turned over to the State Commission for 
Mental Defectives for such examination. The latter Commission, in 
turn, sends its psychotic cases to the former Commission for 
examination. 

The Educational Measurements Bureau, State Department of 
Education, Albany, does considerable examining, chiefly in the smaller 
cities and villages which are organizing special classes and where there 
is no person competent to do it. Ethel L. Cornell, Ph.D., psychologist
of the State Department of Education, devotes part of her time to this 
work and receives some assistance from Jacob S. Orleans, Research 
Associate. Warren W. Coxe, Chief of the Bureau, occasionally does 
some individual examining if the pressure of work permits. 

Record of Results. The results are very inadequate and limited
as they are compiled from incomplete returns. Many of the questions 
were not answered because such statistics had never been kept. Also, 
it was strikingly noticeable that in many cases the questionnaire fell 
into the hands of the psychiatrist, who happened to be the director and 
who willingly filled in the questions, so far as he was able, from a medical 
point of view, whereas it was primarily intended for the psychologist, 
or psychometric examiner. 





In New York City work in this field is being conducted in the following institu- 
tions and departments: Bureau of Children’s Guidance, Catholic Charities Mental 
Clinic, Children’s Court, Mt. Sinai Hospital, Children’s Health Class, Children’s 
Hospital at Randall’s Island, Department of Correction (Blackwell’s Island),
Department of Public Welfare (Bellevue Hospital), Union Theological Seminary,
Educational Clinic of New York, Institute of Child Welfare Research, Lenox Hill 
Hospital, Family Welfare Clinic of A. I. C. P., Service League of America, Hudson 
Guild, Vanderbilt Clinic, Board of Education, Neurological Institute, St. Luke’s 
Hospital, Post Graduate Hospital, U. S. Veteran’s Hospital, Vocational Adjust- 
ment Bureau for Girls, Vocational Service for Juniors, the Psychological Corpora- 
tion. (Further details may be had upon application to the author.) 







404 The Journal of Educational Psychology 


Although no definite conclusions can be drawn, in a general way 
certain interesting facts are brought out and various tendencies noted. 
Only 40 questionnaires (20 from outside New York City and 20 from 
within the city) are included in the tabulations. Those clinics that 
were definitely psychiatric or under the direction of the State Hospital 
Commission or State Commission for Mental Defectives were not 
included. Only a very few of the 22 questions can be mentioned here. 


Question 4. Clinic how long in existence? Range 1 month to 14 years. Median
4 years.

Question 5. The degrees held by the Directors of the different clinics are:
In New York City: M.D., 9; Ph.D., 4; M.D. and Ph.D., 1; M.A., 2;
B.A., 0.
Outside New York City: M.D., 12; Ph.D., 2; M.A., 1; B.A., 0.

Other psychologists on the staff, besides the Director, hold the 

following degrees: 
In New York City: Ph.D., 6; M.A., 11; B.A., 2; degree not men- 
tioned, 15. 
Outside New York City: Ph.D., 1; M.A., 6; B.A., 6; Degree not 
mentioned, 3. 
Thus we have 1 M.D. and Ph.D.; 21 M.D.’s; 13 Ph.D.’s; 20 M.A.’s;
8 B.A.’s and 18 degree not mentioned, making a total of 81 people
employed in clinical work in New York state.
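
The totals in this summary can be checked by adding the four lists given under Question 5. The sketch below is only an arithmetic check: the counts are those printed for directors and other staff, inside and outside New York City, and the variable names are hypothetical.

```python
from collections import Counter

# Degree counts as listed under Question 5.
directors_nyc = {"M.D.": 9, "Ph.D.": 4, "M.D. and Ph.D.": 1, "M.A.": 2, "B.A.": 0}
directors_outside = {"M.D.": 12, "Ph.D.": 2, "M.A.": 1, "B.A.": 0}
staff_nyc = {"Ph.D.": 6, "M.A.": 11, "B.A.": 2, "not mentioned": 15}
staff_outside = {"Ph.D.": 1, "M.A.": 6, "B.A.": 6, "not mentioned": 3}

# Counter.update adds counts key by key across the four groups.
degree_totals = Counter()
for group in (directors_nyc, directors_outside, staff_nyc, staff_outside):
    degree_totals.update(group)

total_people = sum(degree_totals.values())

for degree, count in degree_totals.items():
    print(f"{degree}: {count}")
print("total:", total_people)
```

Summed this way the lists give 21 holders of the M.D. alone, 13 Ph.D.'s, 20 M.A.'s, 8 B.A.'s, 18 with no degree mentioned and 1 with both M.D. and Ph.D., which is the stated total of 81.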

Question 10. Approximate number of cases handled by clinic:

    Feebleminded (below 70 IQ) ............ 27 per cent  )
    Borderline (70-79 IQ) ................. 22 per cent  )  49 per cent

    [label illegible] ..................... 24 per cent  )
    Normal (90-109 IQ) .................... 23 per cent  )  51 per cent
    Superior (110 IQ and over) ............  4 per cent  )



Question 14. Boys, 45 per cent; Men, 8 per cent; together 53 per cent.
             Girls, 34 per cent; Women, 13 per cent; together 47 per cent.

Question 17. Requirements necessary for appointment on staff. Eighteen
different requirements were listed. Those mentioned most frequently were:

This seems to indicate that experience in clinical work and personality are
considered the chief requisites for a clinical psychologist, whereas teaching experi- 
ence is evidently not given much weight. Training in psychiatry and neurology 
were deemed important enough to be added to the list by three different clinics. 
Question 19. Is a thorough physical examination given to each case? Yes, 32;
No, 4; Partial, 2.

Question 20. Is a complete social history of each case obtained? Yes, 27; No, 5;
Partial, 3; When possible, 2; Usually, 1.

Question 21. Fee charged? No, 28; Yes, 2; Not to public school children in
New York City; 25 cents per visit, 2; 50 cents, 1; free clinic work,
but private cases $1 from private clinic and $25 for outside cases, 1;
contemplating scale of prices $1 to $50 depending on case, higher
prices for more intensive work and on private cases.

Question 22. Tests used:

Stanford-Binet (39), Porteus Maze (17), Pintner-Paterson
Performance (16), Healy Completion Nos. 1 and 2 (10), Herring 
Binet (9), Stenquist Mechanical Aptitudes (9), Kelley-Trabue 
Completion (6), National Group Intelligence (6), I. E. R. Toops (4), 
Kuhlmann Binet (4), Ayres-Buckingham Spelling (3), Grays 
Reading (3), Haggerty Group Intelligence (3), Healy Construction 
Puzzle A (3), Knox Cube (3), Otis Group Intelligence (3), Pressey
XO (3), Stanford Achievement (3), Stenquist Assembling (3),
Thorndike-McCall Reading (3), Woodworth-Wells Hard Directions 
(3), Cancellation letter (2), Dearborn Form Board (2), Detroit First 
Grade (2), Haggerty Reading (2), Healy Tapping (2), I. E. R. 
Clerical (2), Jung Association (2), Kohs Block Design (2), Mare and 
Foal (2), Monroe Reading (2), New York State Hospital Guide 
Tests (2), Pintner Non Language (2), Seguin Form Board (2),
Stone Arithmetic Reasoning (2), Symbol Digit (2), Wallin Peg 
Board (2), Woodworth-Wells Easy Directions (2), Woody-McCall 
Mixed Fundamentals (2), Army Alpha (1), Army Beta (1), Burr 
Monotony Test and Worsted Sorting (1), Burr Sewing and Millinery 
Tests (1), Casinet Form Board (1), Courtis Arithmetic (1), Cube 
Construction (1), Detroit Kindergarten (1), Downey Will Temperament
(1), Family Welfare Clinic Practical Knowledge Test (1),
Hayes Revision of the Binet for Blind (1), Healy Learning (1), 
Holtz Algebra (1), Kansas Arithmetic (1), Kansas Silent Reading 
(1), Meyers Pantominal (1), Meyer Group Intelligence (1), Morri- 
son-McCall Spelling (1), Motor Coordination (1), New York State 
Literacy Test (1), Pintner Educational (1), Pressey Group Intelli- 
gence (1), Psychoanalysis (1), Puzzles not standardized (1), Roback 
Superior Intelligence (1), Thorndike Intelligence (1), Thurstones 
Clerical and Typing (1), Triangle Form Board (1), Union Theologi- 
cal Discrimination (1), Union Theological Religious Ideas (1). 


Three clinics stated that their list of tests was too extensive to mention. 


CONCLUSION 


1. There is no centralizing agency for psychological examinations. 
Cases are referred to the various Mental Hygiene Clinics, or to those 
conducted by the State Commission for Mental Defectives, the school 
departments, different private or public agencies, hospitals or private 
individuals. 

2. There are very few psychological clinics, as such, in the state of 
New York where exhaustive examinations are given, but there are a 
great many psychiatric clinics and in nearly all of these the Stanford
Revision of the Binet-Simon test is given either by the physician, who 
is the director, or by a psychometric examiner. 

3. A large percentage of the psychological testing is done by the 
field agents of the State Commission for Mental Defectives, as this 
Commission conducts about 60 monthly mental hygiene clinics in the 
small towns. In addition, the State Hospital Commission conducts 
many psychiatric clinics, but it is only occasionally that the physicians 
in charge of these clinics give a mental test, as all cases needing such 
an examination are referred to the former commission. 

4. One outstanding fact brought out by this survey is the sporadic 
attempts at measurement made by psychological testing, often by 
untrained people who know little if anything concerning standardized 
psychological procedure. We find some physicians and psychiatrists 
giving mental tests who have only a very hazy idea as to what the 
terms “mental age” and “IQ” really mean. The writer has in mind 
a certain physician in one of the state institutions, who appeared to 
be at a great disadvantage in interpreting or even understanding the 
significance of the Binet-Simon tests; yet he claimed to have trained 
others under him to give this test to the inmates. Then, too, in many 
instances, we find teachers who have had only one or two courses in 
psychology administering the necessary tests in the schools, and still 
other individuals being employed as psychometric examiners who hold 
no university degree whatever. 

5. It would appear from the results of the survey that, for the most 
part, the psychological profession is one that is greatly underpaid. 
Much of the psychological work is done by volunteers, many of whom are 
trained psychologists holding higher university degrees. In fact, many 
psychologists with Ph.D. degrees are giving their services gratuitously 
for the benefit of humanity, while the rest do not even receive what an 
ordinary high school teacher naturally expects. There is no legitimate 
reason why the psychologists should not expect and demand adequate 
remuneration for services rendered just as physicians and lawyers do. 

6. There are nine cities in which the school departments have what 
resembles, to some extent at least, a Child Study Laboratory and 
where a psychologist is employed. These cities are Albany, Buffalo, 
Elmira, Ithaca, New York, North Tonawanda, Rochester, Schenectady 
and Syracuse. None of these psychologists hold the Ph.D. degree, 
though one is listed among the qualified examiners in mental defect. 
In some of the other cities the school departments either depend upon 






















the help of outside agencies or the testing is done by a specially 
trained teacher. These cities are Auburn, Binghamton, Corning, 
Cortland, Jamestown, Lockport, Mt. Vernon and New Rochelle. 
There are 239 special classes, not including those in New York City. 

7. Buffalo, Brooklyn and New York City are the only cities in 
which the Children’s Court makes considerable use of the services of 
a psychologist. Six or seven other Juvenile Courts recognize the 
value of a psychological test in extreme cases only and so, occasionally 
in these cities by order of the Judge, a child is referred to an outside 
agency for a mental diagnosis. 

8. There is an overwhelming proportion of recognized psychologists 
located in New York City. Out of a list of 145 psychologists in 
the state, who are either members of the American Psychological 
Association or are registered as qualified examiners in mental defect, 
101 reside in New York City. Only a few of these are engaged in 
clinical work, most of them being connected with the different univer- 
sities. To this number should be added an additional 30 who are 
trained psychometric examiners, or psychologists, devoting their full 
time to testing in the various clinics in the city. These figures perhaps 
explain, in a measure, why there is probably more competition among 
the profession in New York City than in any other city in the world. 

9. There are reported on the questionnaires 10 Ph.D.’s, 13 M.A.’s, 
2 B.A.’s and 15 others not mentioning the degree, actively engaged in 
psychological clinic work in New York City. This does not include 
those working in connection with the universities or with the Institute 
of Child Welfare Research of Teachers College. Outside of New York 
City, scattered throughout the state, there are 3 Ph.D.’s, 7 M.A.’s, 
6 B.A.’s and 3 not mentioning the degree, devoting their full time to 
this work. 

10. Psychiatrists, holding the M.D. degree, are principally the 
directors of the clinics surveyed, though 6 Ph.D.’s and 3 M.A.’s have 
been given the directorship. All the psychiatric clinics conducted by 
the State Commission for Mental Defectives and by the State Hospital 
Commission are also in charge of a psychiatrist. The psychologist 
merely assists him and is directly under his supervision. 

11. Contrary to popular opinion, especially among the medical 
profession, that only the feebleminded require a mental test and are 
brought to the attention of the psychologist, we find from the clinics 
reporting that they have dealt with as many superior, normal and 
backward children as borderline and feebleminded. 




















THE MENTAL ATTAINMENTS OF COLLEGE 
STUDENTS IN RELATION TO THE 
PREPARATORY SCHOOL AND 
HEREDITY 


JOHN W. GOWEN AND MARJORIE GOOCH 


University of Maine 


Ever since man began to live in an organized fashion there have 
been educational systems of more or less influence. Those of greater 
influence have largely centered around a certain few individuals who 
have occupied unique positions in the history of thought. The 
Greek philosophers like Plato, Socrates, and Aristotle in ancient history, 
and men such as Pestalozzi, Pasteur, and William James in more 
modern times have furnished the nuclei around which were built 
schools of tremendous significance. The pupils caught the inspira- 
tion, transmitted it, and furthered it. The public has attributed the 
brilliance of these students to association with the masters, believing 
in other words that it was the environment that made the students 
great. But it is equally apt to be the case that the power and prestige 
of the master attracted the pupils who were inherently of superior 
mental worth and ability. It is also true that the influence of the 
great individual teachers had a wide scope partly because of the fact 
that they were not restricted by mechanical details and routine. Their 
distinctive feature was not actual subject-matter so much as personal 
contact and inspiration. 

While the influence of the high school teacher is not of such inter- 
national import, there exists a real problem as to the actual effect of 
such incentive. We are accustomed to the idea that the personnel 
of a preparatory school is a determining factor of its usefulness. On 
this hypothesis, one might assume that the larger high schools with 
their better equipment, broader curriculum, and in general, better 
paid teachers would produce better college students. 

This question of relative merits of large and small high schools is 
particularly pertinent here in Maine since the majority of the high 
schools throughout the state are small, having but two or three teach- 
ers. The widely scattered and scanty population of the state seems, 
at present, to make the small school inevitable, and it is probably as 
inevitable that the system should be widely criticised. Consequently, any 
actual data which may be collected showing concretely the effect of a 
high school’s training on its pupils’ subsequent college work throw 
light on a problem of immense significance. 
METHODS 


The data upon which this paper is based are the freshman college 
and high school records of 927 of the graduates of the University of 
Maine. The method of gathering the data and the statistical treatment 
are more fully explained in the first paper of this series.¹ 

In approaching this problem of the effect of large and small high 
schools on subsequent college work, only those high schools were 
considered which sent two or more students who later graduated from 
the University between 1913 and 1921 inclusive. These schools 
were divided into three groups, the first two based on their average 
daily attendance as shown in the State Superintendent’s report for 
1915.² One group, the large high schools, includes all those schools 
which in this report have an average daily attendance of 100 or over; 
the second group, the small schools, consists of those high schools 
having less than 100 for an average daily attendance; the third group, 
the academies, comprises those semi-private institutions which 
supplement our high schools. Means, standard deviations, and 
correlation coefficients were figured on this basis.³ 


THE EFFECT OF THE HIGH SCHOOL ON LATER COLLEGE WORK 


By such a method as that already indicated we can readily compare 
the average college ranks of pupils coming from the small high schools 
with those from the large high schools or academies. The range of 
years is such that it would tend to counteract, somewhat, the effect 
of one set of teachers, yet if the general environment and standards 
of an individual school are at all permanent they should show their 
effect on the later college work. If this effect is significant it would 
show that an earlier environment might have its effect on later mental 
work. 

1 Gowen, John W. and Gooch, Marjorie: The Mental Attainments of College 
Students in Relation to Previous Training. Journal of Educational Psychology, 
Nov., 1925. 

2 Report of the State Superintendent of Public Schools of the State of Maine for 
the School Year ending June 30, 1915. 

3 Harris, J. A.: On the Correlation of Intra-class and Inter-class Coefficients 
of Correlation. Biometrika, 1913, Vol. 9, pp. 446-472. 

The average ranks for the college subjects English, chemistry, 
algebra, analytical geometry, advanced French, and elementary 
German, when the high schools are divided into large high schools, 
small high schools, and academies, are shown in Table I. 


TABLE I. AVERAGE COLLEGE RANKS OF STUDENTS FROM PREPARATORY SCHOOLS 

                   | English   | Chemistry | Algebra    | Analytical | Advanced   | Elementary 
                   |           |           |            | geometry   | French     | German 
Large high schools | 82.6 ± .3 | 80.1 ± .4 | 85.4 ± .5  | 80.9 ± .5  | 88.9 ± .5  | 83.4 ± .6 
Small high schools | 82.0 ± .7 | 82.9 ± .7 | 82.4 ± 1.1 | 83.8 ± 1.0 | ...        | 82.1 ± 1.0 
Academies          | 80.5 ± .5 | 79.5 ± .6 | 78.3 ± .9  | 79.3 ± .8  | 78.8 ± 1.3 | 82.2 ± .8 





Three out of five cases, those of English, algebra, and German, 
show a higher average in the larger schools than in the smaller schools. 
In one case, that of advanced French, the small high schools are appar- 
ently entirely inadequate, providing so little French as to make it 
impossible for more than very few of their pupils to take advanced 
work in college. In all these circumstances the large high schools show 
better college rank than do the academies. This is somewhat 
surprising in view of the academies’ financial standing, which might 
be expected to provide better teachers and equipment. 

While in the majority of cases the large school seems to show up 
better than the small, the differences when their probable errors are 
compared are not significant. In fact, the only difference between the 
large and small high schools that could be called at all significant is 
that of college chemistry where the small high schools are ahead with 
an average of 82.9 ± .7 as against an average of 80.1 ± .4 for the large 
high schools. The probable error of this difference (which is the square 
root of the sum of the squares of the probable errors) is one-third 
the difference, the probable error of the difference being .806 while 
the difference itself is 2.8. The chances are 22 to 1 against a difference 
as great or greater than this coming from random sampling.¹ In 
the other instances there is not even this difference between the large 
and small high schools, for while algebra, for example, with 85.4 ± .5 for 
the large high schools and 82.4 ± 1.1 for the small would seem at first 
glance to be quite widely separated, yet this difference is not significant 
when measured by the probable error of the difference. 
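
The arithmetic of this comparison can be retraced with a short sketch. It assumes a normal distribution of errors, a two-sided reading of "as great or greater," and the conventional factor .6745 relating the probable error to the standard deviation; the function names are illustrative and are not taken from the paper. Under these assumptions a difference of exactly three probable errors gives odds of roughly 22 to 1, which appears to be the rounded ratio the text relies on.

```python
from math import erfc, sqrt

# Probable error expressed as a fraction of the standard deviation
# (conventional value for a normal distribution).
PE_PER_SIGMA = 0.6745


def pe_of_difference(pe_a, pe_b):
    """Probable error of a difference: root of the sum of squared probable errors."""
    return sqrt(pe_a ** 2 + pe_b ** 2)


def odds_against_chance(ratio):
    """Odds against a deviation of at least `ratio` probable errors,
    in either direction, arising from random sampling (normal curve assumed)."""
    p = erfc(ratio * PE_PER_SIGMA / sqrt(2))
    return (1 - p) / p


# Chemistry: small high schools 82.9 +/- .7, large high schools 80.1 +/- .4
chem_diff = 82.9 - 80.1
chem_pe = pe_of_difference(0.7, 0.4)

# Algebra: large high schools 85.4 +/- .5, small high schools 82.4 +/- 1.1
alg_diff = 85.4 - 82.4
alg_pe = pe_of_difference(0.5, 1.1)

print(f"chemistry: difference {chem_diff:.1f}, P.E. {chem_pe:.3f}, ratio {chem_diff / chem_pe:.2f}")
print(f"algebra:   difference {alg_diff:.1f}, P.E. {alg_pe:.3f}, ratio {alg_diff / alg_pe:.2f}")
print(f"odds against chance at 3 P.E.: {odds_against_chance(3.0):.1f} to 1")
```

The algebra difference comes out at less than three times its probable error, which agrees with the statement that it is not significant by this criterion.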

1 Pearl, Raymond and Miner, J. R.: A Table for Estimating the Probable 
Significance of Statistical Constants. Maine Agricultural Experiment Station, 
Annual Report for 1914, pp. 85-88. 

The academy in Maine’s secondary system, while it often takes 
the place of the tax-supported high school, usually has some additional 
means of support. It is ordinarily a partially endowed institution 
which charges a tuition fee. Some of these academies have been 
established for years and are the survivals of the academy system so 
prevalent in the early part of the nineteenth century. They have 
much prestige and a reputation of which they are often proud. Con- 
sequently, it is with some surprise that we consider their averages in 
Table I. When we compare the large high schools and the academies, 
however, we find only two significant differences, algebra and French, 
so that the discrepancy is not so great as it would at first seem to be. 
In general, then, it appears that the size of the high school has 
practically no relation to the average college marks which its pupils later 
receive. 

The problem of the influence of the high school, or the effect of the 
environmental factor of quality of high school training on the subse- 
quent performance in college life, may be more adequately examined in 
another way. Suppose a high school has poor equipment, poor 
teachers, and a community which does not stimulate scholastic effort. 
It would be expected that the pupils attending such a high school 
would be able to accomplish less in college because of this poor environ- 
ment than would students who came from a well-equipped school with 
stimulating teachers and an environment encouraging scholastic 
effort. Furthermore, all the pupils of the first school would be 
expected to show the effects of these conditions by lower college grades 
and all the pupils of the well-equipped high school would be expected 
to show the influence of their better environment in the high grade 
they obtain in college. Of course, there will be all gradations of qual- 
ity in the high schools, the extreme case being taken for clarity of 
illustration. The problem may now be turned about and viewed from 
the other direction. If one poor pupil comes from a certain high school 
and that pupil’s lack of mental ability is due to the unfavorable environ- 
ment of that high school, when another pupil from that school comes to 
college it would be expected that this second pupil would show the 
influence of the poor environment by low college ranks. Similarly, if 
a pupil of superior mental ability comes from a given school and that 
superior mental ability is due to the quality of the equipment, the 
caliber of the teachers, and the stimulating character of the surround- 
ing population, then another student coming from that high school 
would be expected to show the influence of that favorable environment 
by also showing a better than average mental ability. Now this 
same reasoning holds good for the larger case where more than two 
pupils and schools are considered. In other words, if mental ability in 
college is dependent on high school environment then there should be 
a correlation between the mental capacities of the students coming 
from the same high schools. Pupil A should resemble pupil B from 
the same high school more than he does pupil C coming from a different 
high school. If no such correlation exists, then this environmental 
factor of quality of high school cannot be said to play a large part in 
subsequent mental ability of the pupils. That is, the problem is 
forced back one step to the differentiation of the individual homes 
because of the heredity of the parents and also the environment with 
which they surrounded their children. Table II shows these correlation 
coefficients for the same divisions, large and small high schools, 
and academies. This effect of high school environment on college 
attainments is shown graphically in Fig. 1. 

Fig. 1. The relation between high school environment and college attainments. 
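
The idea that pupil A should resemble pupil B from the same school more than pupil C from another can be illustrated with a minimal sketch of an intra-class correlation. It enters every ordered pair of pupils from the same school into a symmetrical table and takes the ordinary product-moment correlation over those pairs. This is the direct pair-by-pair definition and not necessarily the computational shortcut the authors used; the school names and ranks below are invented purely for illustration.

```python
from itertools import permutations


def intraclass_correlation(classes):
    """Product-moment correlation over all ordered pairs of members
    drawn from the same class (symmetrical-table definition)."""
    pairs = [
        (a, b)
        for ranks in classes.values()
        for a, b in permutations(ranks, 2)
    ]
    if not pairs:
        raise ValueError("need at least one class with two or more members")
    n = len(pairs)
    # The table is symmetrical, so both margins share one mean and one variance.
    mean = sum(a for a, _ in pairs) / n
    variance = sum((a - mean) ** 2 for a, _ in pairs) / n
    if variance == 0:
        raise ValueError("all ranks are identical; correlation is undefined")
    covariance = sum((a - mean) * (b - mean) for a, b in pairs) / n
    return covariance / variance


# Hypothetical college ranks grouped by the high school the pupils came from.
example_schools = {
    "School A": [85.0, 88.0, 90.0],
    "School B": [70.0, 74.0],
    "School C": [80.0, 79.0, 83.0],
}
print(f"intra-class r = {intraclass_correlation(example_schools):.3f}")
```

A value near zero would indicate that schoolmates resemble each other no more than pupils taken at random, which is the situation the paper reports for most subjects.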

TABLE II. CORRELATION BETWEEN THE COLLEGE RANKS OF TWO OR MORE 
PUPILS COMING FROM THE SAME HIGH SCHOOL, IN THE COLLEGE SUBJECTS 
INDICATED 

                   | English      | Chemistry    | Algebra      | Analytical   | Advanced     | Elementary 
                   |              |              |              | geometry     | French       | German 
Large high schools |  .017 ± .034 | -.167 ± .037 |  .374 ± .036 | -.012 ± .044 |  .055 ± .069 | -.009 ± .055 
Small high schools | -.119 ± .066 | -.068 ± .080 |  .079 ± .090 |  .134 ± .093 | ...          |  .070 ± .109 
Academies          | -.021 ± .056 | -.038 ± .061 | -.039 ± .067 |  .115 ± .072 | -.044 ± .144 | -.098 ± .090 

It will be seen from this table that all the correlations are small. 
They range from -.009 ± .055 for the large high schools in German to 
+.374 ± .036 for the large high schools in algebra. Only two are 
significant: that of chemistry in large high schools (-.167 ± .037), 
and algebra for the large high schools (+.374 ± .036). One significant 
correlation coefficient, that of chemistry, is negative (-.167 ± .037), 
which means that if we had one better-than-average pupil from a 
certain school in this group, instead of the next pupil’s being 
better-than-average, we would be more likely to find him 
poorer-than-average. The significance of these two correlation coefficients seems 
doubtful when the other coefficients in the group are examined. Thus 
for algebra, the small high schools have a correlation coefficient for 
their pupils of .079 and the academies have a correlation coefficient of 
-.039. The correlation coefficients for the chemistry group are 
-.167, -.068, and -.038. In view of these facts and surprising as it 
may seem to many, the general trend of these results clearly points 
to the conclusion that the quality of the high school does not play more 
than an insignificant part in the subsequent college work which the 
student does. Such a result removes from the field of argument one 
of the environmental factors which, on the basis of pure logic, would 
appear likely to produce the most profound effect on mental ability. 
It shows what critical scrutiny should be given any suggested 
environmental factor before it is accepted as a cause of variation in mental 
ability. 

Our results do not furnish, unfortunately, a criterion to determine 
the influence of another environmental factor, the home. The two 
major variables which appear to account for our previous results¹ (that 
mental ability in either college or high school is an attribute of the 
individual) are the environment of the home and the inheritance which 
the parents were able to transmit to their children. While these two 
factors are intimately linked together an attempt will be made to 
analyze partially the influence of these variables by a comparison of the 
college attainments of brothers and sisters. 


INFLUENCE OF HEREDITY ON THE PERFORMANCE OF COLLEGE 
STUDENTS 


The original 927 students, graduated between 1913 and 1921, as 
used in the other papers, proved to contain too few brothers and sisters. 
Consequently, all students graduated since 1921 or in college during 
1922-23 are included as our basis for selection. One hundred and fifty 
groups of brothers and brothers, brothers and sisters, or sisters and 
sisters were found. The correlation of the group with college rank was 
determined by means of the Harris correlation for inter- and intra- 
class coefficients. Those familiar with the behavior of the chromo- 
somes will realize that the like inheritance of siblings which tends to 
make them resemble each other in their attributes is only about half 
of the inheritance received. In correlations of such siblings we are 
consequently measuring only a small part of such inheritance. The 
complete influence of inheritance is many times greater than that 
which may be observed in such comparisons. The results for the 
comparison of the college attainments of siblings are given in Table III. 





1 Loc. cit. 
TABLE III. CORRELATION OF COLLEGE RANKS FOR ALL PAIRS OF SIBLINGS 

Subject             | Correlation coefficient | Number of individuals 
English             | .152 ± .039             | 279 
Chemistry           | .156 ± .052             | 159 
Algebra             | .155 ± .065             | 103 
Analytical geometry | .221 ± .072             |  79 
Advanced French     | .192 ± .104             |  39 
Elementary German   | .413 ± .114             |  24 


The data in Table III are very limited. In the attempt to divide 
them further into brothers and brothers, sisters and sisters, and 
brothers and sisters it was found that the amount of material became 
so limited as to be subject to very large errors of random sampling. 
The average correlation for the only like-sexed comparison worthy of 
consideration, brothers and brothers, was .187 for the first four sub- 
jects listed in Table III. The average correlation for all siblings is 
.171 for these subjects. These results justify the grouping of all sexes 
of siblings. 

The results show a slight correlation between the mental attainments 
of siblings in collegiate work. The probable errors are all large 
in comparison with the correlation coefficients, due no doubt to lack 
of numbers. 
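
The connection between the small numbers and the large probable errors can be made concrete with a brief sketch. It assumes the conventional formula for the probable error of a correlation coefficient, .6745(1 - r²)/√n, with n taken as the number of individuals; the paper does not state which formula was used, so this is an assumption, although the values it yields agree with those printed in Table III to three decimals. The rows are listed in the order of the table.

```python
from math import sqrt


def probable_error_of_r(r, n):
    """Conventional probable error of a correlation coefficient r from n cases."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)


# (correlation, number of individuals, probable error as printed), in table order.
TABLE_III = [
    (0.152, 279, 0.039),
    (0.156, 159, 0.052),
    (0.155, 103, 0.065),
    (0.221, 79, 0.072),
    (0.192, 39, 0.104),
    (0.413, 24, 0.114),
]

for r, n, reported in TABLE_III:
    pe = probable_error_of_r(r, n)
    print(f"r = {r:.3f}  n = {n:3d}  computed P.E. = {pe:.3f}  "
          f"printed P.E. = {reported:.3f}  r / P.E. = {r / pe:.1f}")

# Average of the first four subjects, quoted in the text as .171.
mean_first_four = sum(r for r, _, _ in TABLE_III[:4]) / 4
print(f"mean of first four correlations = {mean_first_four:.3f}")
```

Because the probable error shrinks only with the square root of n, the rows with 39 and 24 individuals carry probable errors nearly three times that of the row with 279.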


TABLE IV. CONSTANTS INDICATING THE INHERITANCE OF INTELLIGENCE (FROM 
ELDERTON) 

Character                          | Relatives compared   | Correlation | Authority 
Intelligence (teacher’s estimate)  | Brother and brother  | .52         | Pearson 
                                   | Brother and sister   | .49         | 
                                   | Sister and sister    | .50         | 
Intelligence (relative’s estimate) | Parent and offspring | .44         | Elderton 
Intelligence (quotient)            | Siblings             | .54         | Gordon, analyzed by Elderton 
Intelligence (teacher’s estimate)  | Brother and brother  | .49         | Drinkwater, analyzed by Elderton 
                                   | Brother and sister   | .54         | 
                                   | Sister and sister    | .50         | 
[illegible]                        | Brother and brother  | .49         | Drinkwater, analyzed by Elderton 
                                   | Brother and sister   | .38         | 
                                   | Sister and sister    | .53         | 
The conclusion that the mental attainment of siblings in college 
work is correlated falls in line with the results on intelligence tests and 
intelligence estimates. In her competent review, Miss Elderton summarizes 
the present status of the problem of the inheritance of intelligence. 
The constants for these summarized results are shown in 
Table IV. 


The results of Table IV clearly show intelligence to be inherited. 
The equally convincing results of Galton in his studies of genius 
arrived at the same conclusion. Schuster and Elderton using Oxford 
Class lists and Charterhouse School records present convincing data 
supporting the same contention. Terman’s work shows intelligence 
to be a permanent attribute of the individual with heredity as the chief 
cause of the permanence. Thorndike, Rietz, and numerous other 
investigators, as well as the United States army tests, all support the 
conclusion that intelligence is inherited. 


The first part of this paper showed that the high schools of this 
state did not cause any differentiation in the college ability of their 
students. Certainly this environmental factor of training would be 
considered by most impartial observers as likely to be of major impor- 
tance in the display of subsequent mental ability by the student. But 
it has no effect. Other environmental factors have been studied by 
Heron¹ and others in the hope of finding one or more such factors 
which would show a consistent influence on intelligence. The search 
was almost in vain. The list includes cleanliness, nutrition, condition 
of glands, tonsils, and teeth, weight and height, parents’ condition as 
to sobriety, and morality, physical and economic status of home. All 
save cleanliness showed no effect and cleanliness may be interpreted 
as due to intelligence. The permanence of intelligence under different 
environmental conditions clearly points to the same conclusion. The 
permanence of heredity under identical environment also favors 
heredity as the cause of intelligence. 


In view of these facts we may inquire if the results on the inheri- 
tance of intelligence are comparable with other known data where 
there is little chance for the environment to play more than an insignif- 
icant part. Here again Miss Elderton has reviewed the literature 
and collected some pertinent data for us. Siblings have correlations 
for eye color, hair color, and hair curliness ranging between .50 and .62 
with a mean of .54. Only the dye pot and the curling iron have shown 
any pronounced influence on any of these characters and these factors 
can scarcely be said to form a part of natural environment. 

1 Elderton, Ethel M.: A Summary of the Present Position with Regard to the 
Inheritance of Intelligence. Biometrika, 1923, Vol. XIV, pp. 378-408. 

It may be agreed that intelligence is not physical but only an 
expression of a physiological process. From such a point of view we 
may quote the work of the authors on the inheritance of a strictly 
physiological process, the inheritance in cattle of milk quantity and 
quality. In this work the correlation of the milk-yields of full sisters 
was .55 ± .03. The correlation of the butter-fat percentages of these 
full sisters was .46 ± .03. The correlation of daughter and dam for 
their milk yields was .50 ± .02 and for the butter-fat percentage was 
.41 ± .02. These results clearly point to the same degree of inheritance 
in the physiological expression of gland secretions as that displayed by 
intelligence. 

But what about the results on the relation of the mental attainments 
of college siblings as already given? They are correlated to not 
a third of the correlation which exists between either the teacher’s 
estimate of the intelligence, or of the intelligence estimated by the 
method of the intelligence quotient. Here, then, is the big educational 
problem. When the educational system can be so perfected as to 
develop fully the innate heredity into expression as the mental attain- 
ments shown by college work, then and only then can the system be 
said to be adequate in its method. But, be that as it may, to a limited 
degree it has been shown that brothers are correlated in their mental 
attainments in college. As brothers have only half their inheritance 
common, and since the common inheritance is all that concerns the 
correlation between them, it becomes evident that all the correlations 
between the mental attainments in college as found here in these 
studies could be accounted for by the one variable—heredity. 


SUMMARY 


This paper presents a study of the influence of the high school and 
of heredity on the mental attainments of the student in college. The 
results show that the high school has little or no influence on the per- 
formance of the individual in college. The previous papers have 
furnished data to prove that performance in college is correlated with 
the performance of the individual in high school. Thus we see that 
the largest element in subsequent college performance is the individual’s 
innate capacity and not the environment into which he may have been 
thrown previously. This point is further emphasized by the demonstration 
of a slight correlation between the college performances of 
brothers. When it is realized that not more than half the heredity of 
the individual is accounted for by these correlations, it becomes clear 
that heredity is the major, governing, demonstrable force in regulating 
college attainments. The correlations point to further opportunities 
for progress, for today in our educational systems we are not utilizing 
over a third of the possible similarity in intelligence which is known to 
exist between pairs of siblings. 





NOTE 
In the previous paper of this series, Age, sex, and the interrelations 
of mental attainments of college students, published in the March issue 
of this Journal, the wrong cut was used on page 203. The cut 
appearing here should have been used. 
Fig. 1. Graph showing the average mental attainments of college students in 
chemistry for given grades of ability in other subjects. Solid line, English; 
broken line, algebra; dash dot dash line, analytical geometry; dot dot dash 
line, German. 
THE RECALL OF OBSERVED MATERIAL 
JOHN A. McGEOCH AND PAUL L. WHITELY 
Washington University 


The learning and recall of such material as nonsense syllables, 
numbers and lists of words as usually carried on, do not resemble 
closely either in content or in method the conditions under which the 
acquisition and recall of experience take place in actual life. This 
paper deals with an experiment designed to measure recall when the 
original learning conditions and the material resemble somewhat more 
closely certain of the conditions of extra-laboratory acquisition. 
Complete resemblance is not claimed, but only a closer approach to 
such conditions. Much of the knowledge of any person has been 
acquired as a result of more or less brief observation, not under formal 
learning conditions, and to be of significance it must be retained for 
more than a few days. The problem of this experiment is to measure 
the recall of briefly observed material after intervals of 30, 60, 90 
and 120 days. 

The material used for observation or “learning” was the Binet 
object-card. This is a rectangular sheet of orange-yellow cardboard, 
33.5 × 40.5 cm. Attached to it are two photographs, a red and white 
label, a white button, a one cent postage stamp and a penny. 

Four groups of subjects recalled in the form of a written narrative 
reproduction, in which they were instructed to describe the objects so 
carefully that anyone who had not seen the card would know all about 
what was on it. Four other groups recalled in the form of answers to 
an interrogatory of 50 questions.¹ With both narrative and interrogatory, 
each group wrote an immediate recall and a delayed recall, the 
intervals of delay being 30, 60, 90 and 120 days. The subjects were 
college sophomores, about equally divided between the sexes. The 
numbers of subjects in the groups were: 


INTERVAL NARRATIVE INTERROGATORY 
30 39 37 
60 47 41 
90 31 35 
120 26 42 


1 With few exceptions, the questions were those suggested by Whipple: “Manual 
of Mental and Physical Tests.” Baltimore: Warwick and York, 1915. 

It will be noted that the conditions are so arranged that the practice 
effect of frequent subsequent recalls by the same group, as in 
Dallenbach’s work¹ on the relation of memory error to time interval, 
is eliminated. Each group in the present experiment recalls but 
twice: immediately after observation and again after either 30, 60, 90 
or 120 days, depending on the group. 

The subjects were instructed, just prior to the observation of the 
object card, that they were to be shown a cardboard with several 
objects attached to it, and that they were to inspect it as carefully as 
possible, because immediately after the observation they would be 
asked to recall as much as possible of what they had observed. They 
were then permitted to observe the card in groups of three for a period 
of 30 seconds. Immediately following the observation, they began 
the writing of the recall, either narrative or interrogatory, depending 
on the group. 

At the expiration of the interval, the subjects were reminded of 
their previous observation of the object-card and instructions for the 
writing of the recall were given. As a check on the possible effect of 
communication between students during the interval, each student 
was asked, at the completion of his recall, to state whether he had 
talked to anyone else regarding the material observed, or whether he 
had given it a review during the interval. The statements indicate 
that the effect of intercommunication and of review may be considered 
negligible. 

The narratives were scored by counting each separate item, right 
or wrong, as one. Thus: “An oval picture of a horse-drawn sleigh” 
would be given a credit of four points.2 Interpretations and indefinite 
statements were not counted. The interrogatories were scored by 
counting the answer to each question, right or wrong, as one. 

We shall deal first with the effect of the time intervals upon correct 
narrative recall. Table I presents the means of the immediate and 





1 Dallenbach, K. M.: The Relation of Memory Error to Time Interval. 
Psychological Review, Vol. XX, 1913, pp. 323-327. 

2 It may be objected that this method of scoring lacks objectivity, inasmuch as 
different scorers might differ in their judgments as to the exact division into 
separate items. In order to test the validity of this objection, both the immediate 
and the delayed recall narratives at 30 and 60 days were scored twice by different 
people—an advanced student in the department and one of the experimenters— 
with the same scoring directions, and the retention curves plotted. The results 
varied only minutely. It would seem that the fact of virtually identical results 
obtained by different scorers working in ignorance of each other’s results, validates 
the objectivity of the method. 



















The Recall of Observed Material 421 


delayed recalls at each interval, with their differences and standard 
deviations. 

TABLE I.—MEANS OF IMMEDIATE AND DELAYED NARRATIVE RECALLS—ITEMS RIGHT 

INTERVAL   IMMEDIATE MEAN   DELAYED MEAN   DIFFERENCE   SIGMA OF DIFFERENCE 
   30           58.6            52.1           6.5              2.8 
   60           61.0            44.5          16.5              3.1 
   90           60.3            36.3          24.0              3.7 
  120           64.7            35.4          29.3              4.6 


If we take a difference which is three times its sigma as indicating 
satisfactory reliability, all of the differences are thoroughly reliable, 
except that between the immediate and the 30 day recalls. In this 
case, at least the direction of the difference is certain. 
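
The criterion just stated can be checked by simple arithmetic. The following is a minimal sketch in Python, not part of the original study; the names `TABLE_I`, `critical_ratio` and `reliable_intervals` are illustrative assumptions, and the figures are copied from Table I.

```python
# Differences and sigmas of the differences, copied from Table I
# (narrative recall, items right). Keys are intervals in days.
TABLE_I = {
    30: (6.5, 2.8),
    60: (16.5, 3.1),
    90: (24.0, 3.7),
    120: (29.3, 4.6),
}


def critical_ratio(difference, sigma):
    """Return the difference expressed in units of its own sigma."""
    return difference / sigma


def reliable_intervals(table, threshold=3.0):
    """List the intervals whose difference is at least `threshold` sigmas."""
    return [
        days
        for days, (difference, sigma) in sorted(table.items())
        if critical_ratio(difference, sigma) >= threshold
    ]


if __name__ == "__main__":
    for days, (difference, sigma) in sorted(TABLE_I.items()):
        print(days, round(critical_ratio(difference, sigma), 2))
    print("reliable at three sigmas:", reliable_intervals(TABLE_I))
```

On these figures the 30 day difference is about 2.3 times its sigma, while the other three exceed five times their sigmas, which agrees with the statement in the text.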

Table I gives simply the means of the raw scores. Since four differ- 
ent groups are involved and since each gives a somewhat different 
mean, a curve of retention can not be plotted from the differences 
between the means of the raw scores. It will be convenient, therefore, 
to express retention in terms of the per cent that the delayed recall is 
of the immediate, letting the immediate recall in each case represent 
100 per cent. Table II gives these per cents.1 


TABLE II.—PER CENT THAT DELAYED RECALL IS OF IMMEDIATE 

INTERVAL   PER CENT 
    0       100.0 
   30        88.8 
   60        73.0 
   90        60.1 
  120        54.7 
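
The conversion to per cents can be approximated from the means of Table I. The sketch below is illustrative only: the authors computed their per cents from total scores, which are not given here, so the figures obtained from the rounded means may differ from the printed ones in the first decimal place. The names used are assumptions made for the example.

```python
# Interval in days -> (immediate mean, delayed mean), copied from Table I.
NARRATIVE_MEANS = {
    30: (58.6, 52.1),
    60: (61.0, 44.5),
    90: (60.3, 36.3),
    120: (64.7, 35.4),
}

# Per cents as printed in Table II (computed by the authors from totals).
PUBLISHED_PER_CENT = {30: 88.8, 60: 73.0, 90: 60.1, 120: 54.7}


def per_cent_retained(immediate, delayed):
    """Delayed recall as a per cent of immediate recall (immediate = 100)."""
    return 100.0 * delayed / immediate


def retention_from_means(means):
    """Apply per_cent_retained to every interval in the table."""
    return {
        days: per_cent_retained(immediate, delayed)
        for days, (immediate, delayed) in means.items()
    }


if __name__ == "__main__":
    estimates = retention_from_means(NARRATIVE_MEANS)
    for days in sorted(estimates):
        print(days, round(estimates[days], 1), PUBLISHED_PER_CENT[days])
```

Expressing each delayed score against its own group's immediate score is what permits four different groups to be placed on one curve.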


The same data are presented in Fig. 1 in the form of a curve of 
retention, when retention is measured by recall in the form of a narra- 
tive reproduction. 

The curve of retention falls in very nearly a straight line from 
immediate recall to recall after 90 days, and somewhat more slowly 
from 90 to 120 days thus showing a slight negative acceleration. 
After 120 days, however, only 45.3 per cent of the material reproduced 
on immediate recall has been lost. Direct comparison can not be 
made between this curve and the classical Ebbinghaus retention curve, 
because no measures of recall were taken at the intervals used by 
Ebbinghaus. It is evident, however, that the forgetting of observed 





1 Per cents have been computed throughout from total scores, not from means. 
material when measured by narrative reproduction is not as great as 
is the forgetting of nonsense syllables, when retention is measured by 
relearning time. Radossawljevitch1 found the forgetting of nonsense 
syllables to be almost complete after 120 days. After 30 days practically 
four-fifths of Ebbinghaus’ syllables had been forgotten;2 while 
Finkenbinder3 and Luh4 found 55.6 and 47.7 per cent retention, 
respectively, after an interval of only two days. 
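
The shape of the narrative curve described above may be examined by taking the loss in each successive 30 day step. This is a hedged sketch using only the per cents of Table II; the names `RETENTION` and `losses_per_step` are introduced for illustration.

```python
# (interval in days, per cent retained), copied from Table II.
RETENTION = [(0, 100.0), (30, 88.8), (60, 73.0), (90, 60.1), (120, 54.7)]


def losses_per_step(curve):
    """Return (start day, end day, per cent lost) for each adjacent pair."""
    return [
        (day_a, day_b, round(pct_a - pct_b, 1))
        for (day_a, pct_a), (day_b, pct_b) in zip(curve, curve[1:])
    ]


if __name__ == "__main__":
    for start, end, lost in losses_per_step(RETENTION):
        print(f"{start:>3} to {end:>3} days: {lost} per cent lost")
    total_lost = round(RETENTION[0][1] - RETENTION[-1][1], 1)
    print("total lost after 120 days:", total_lost)
```

The first three steps show losses of roughly similar size, and the last step a smaller one, which is the slight negative acceleration noted in the text.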


[Fig. 1. Curve of retention, narrative recall; vertical axis in per cents.] 


Luh has pointed out the manner in which the retention curve varies 
with the method of measurement and with the degree of original learn- 
ing. He has also suggested that the age of the subjects and the type 
of material may be important factors. In this experiment, the 
material has differed widely from that of most investigators, and this 
difference in material is probably the main factor in producing a 





1 “Das Behalten und Vergessen bei Kindern und Erwachsenen.” Leipzig, 1907. 

2 Über das Gedächtniss: “Untersuchungen zur experimentellen Psychologie.” 
Leipzig, 1885. 

3 The Curve of Forgetting. American Journal of Psychology, Vol. XXIV, 1913. 

4 “The Conditions of Retention.” Univ. of Chicago Libraries, 1922, (or 
Psychological Monographs, Vol. 31, 1922). 
different type of retention curve and a smaller amount of forgetting. 
The objects on the card could be grasped readily as meaningful wholes, 
without the necessity of detailed verbalization attendant upon rote 
learning. An observation time of 30 seconds could hardly permit 
more than partial “learning,” in the sense that each item could hardly 
have been noted more than once or twice. Further, we have measured 
recall by the reproduction method, which Luh lists as being harder 
than recognition, reconstruction or relearning, for nonsense syllables 
at least. One would be led to expect, then, from partial learning by 
adults and the reproduction method, no unusual retention. The 
inference that the material is the differential factor seems sound. 

We shall turn next to the amount of retention when recall is meas- 
ured in terms of answers to 50 questions about the material observed. 
Four different groups of subjects are, of course, involved here. Table 
III shows the means of the immediate and delayed recalls at each 
interval, with their differences and sigmas. 


TABLE III.—MEANS OF IMMEDIATE AND DELAYED INTERROGATORY RECALLS—ITEMS RIGHT 

INTERVAL   IMMEDIATE MEAN   DELAYED MEAN   DIFFERENCE   SIGMA OF DIFFERENCE 
   30           36.9            36.0           0.9              2.0 
   60           38.2            36.5           1.7          [illegible] 
   90           37.8            33.1           4.7              1.6 
  120           38.0            35.5           2.5              1.5 


None of the differences between immediate and delayed recall is 
three times its sigma, although the difference at the 90 day interval 
almost satisfies this criterion. All of the differences indicate a loss 
with the passage of time, but a very small one. The only thing that 
can be inferred from the figures is that the direction of the difference 
is likely to be toward a decrease in retention, but that very little can 
be said regarding the absolute amount, except that it is small. 

Table IV shows the per cents that delayed recalls are of immediate 
recalls, immediate recall thus representing 100 per cent. The same 
data are shown in the form of a retention curve in Fig. 2. 


TABLE IV.—PER CENT THAT DELAYED RECALL IS OF IMMEDIATE RECALL 

INTERVAL   PER CENT 
    0       100.0 
   30        97.7 
   60        95.5 
   90        87.8 
  120        93.4 
It is evident that the interrogatory form of recall gives very different 
results from the narrative form of recall of the same material. 
The groups contain roughly similar numbers of subjects and all are 
of the same academic standing and of similar general training and 
ability. The assumption that we are using equivalent groups seems 
valid. Further, the questions cover the outstanding features of the 
object with some thoroughness. Nevertheless, the narrative form of 
recall yields a steadily falling curve which drops 45.3 per cent after 120 
days. The interrogatory form of recall gives a curve which falls very 
slowly from immediate recall to recall after 90 days and rises from there 
to 120 days. This rise to 120 days is probably due to chance factors, 
but it is probable at least that the difference between the amounts 
forgotten at 90 and at 120 days, measured by interrogatory recall is 
very slight. 
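
The contrast between the two forms of recall can be set out side by side by converting the per cents retained into per cents lost. The sketch below is illustrative; it uses only the figures printed in Tables II and IV, and the names `NARRATIVE`, `INTERROGATORY` and `per_cent_lost` are assumptions made for the example.

```python
# Per cent retained at each interval (days), copied from Tables II and IV.
NARRATIVE = {30: 88.8, 60: 73.0, 90: 60.1, 120: 54.7}
INTERROGATORY = {30: 97.7, 60: 95.5, 90: 87.8, 120: 93.4}


def per_cent_lost(retained):
    """Convert per cent retained into per cent lost, to one decimal place."""
    return {days: round(100.0 - pct, 1) for days, pct in retained.items()}


if __name__ == "__main__":
    narrative_lost = per_cent_lost(NARRATIVE)
    interrogatory_lost = per_cent_lost(INTERROGATORY)
    print("days  narrative  interrogatory")
    for days in sorted(narrative_lost):
        print(f"{days:>4}  {narrative_lost[days]:>9}  {interrogatory_lost[days]:>13}")
```

At every interval the loss by narrative reproduction exceeds the loss by interrogatory, and the largest interrogatory loss falls at 90 days rather than at 120 days.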

Thus far we have dealt with the correct recall of observed items. 
In every recall there were at least a few errors. The error scores are, 
however, highly variable and the standard deviations of the differences 
between the error means in immediate and delayed recall indicate low 
reliability. Further, the error curves are too irregular to permit any 
certain conclusions, save that they are irregular. For this reason 
they are omitted. In general, narrative errors are higher in delayed 
recall after 30 and 60 days than in immediate recall, and lower after 90 
and 120 days than in immediate recall. Interrogatory errors are lower after 
30 days than in immediate recall, and increase from that point to a level 
40 per cent above immediate recall after 120 days. 

Per cents right are higher in immediate than in delayed recall, but 
the differences are small and do not vary with the time interval. 


CONCLUSIONS 


1. The curve of retention for observed material, measured by 
narrative reproduction, falls in very nearly a straight line from 
immediate recall to recall after 90 days, and somewhat more slowly 
from 90 to 120 days, thus showing a slight negative acceleration. 
After 120 days, 45.3 per cent of the material has been forgotten. 

2. The curve of retention for observed material, measured by 
answers to an interrogatory, falls very slowly from immediate recall 
to recall after 90 days and probably remains roughly constant from 
90 to 120 days. The largest amount forgotten after an interval is 
12.2 per cent after 90 days. 

3. The curve of retention varies with the method of measurement, 
the forgetting being much greater when measured by narrative 
reproduction than it is when measured by answers to an interrogatory. 

4. The curves for both narrative and interrogatory do not resemble 
closely the classical Ebbinghaus curve of forgetting. While no recalls 
have been taken after shorter intervals than 30 days, it is clear that 
the drop in the curve after 30 days is by no means as large as that 
found by investigators using nonsense syllables. It is probable that 
this difference is due to the difference in material. 
A TEXT WITH THE GESTALT VIEWPOINT 


Educational Psychology, by Charles Fox. New York: Harcourt, 
Brace and Company, 1925. Pp. 380. 


It is a worthy book and is well written. Its clear, alluring style is 
accentuated by the splendid make-up. Interest in it is increased by 
the inclusion of some less usual topics and the Gestalt viewpoint of its 
author. As such it rejects, naturally, the theory of association and 
atomism (“ . . . the whole notion of mental elements is artificial and 
invalid. Mental life is an organic unity and all parts are what they 
are by virtue of the living whole.”). In discussing the neural basis 
of habit the author says, “The function of repetition in habit formation 
is, therefore, not to establish paths of low resistance but to attune the 
muscular system to the appropriate configuration, just as a violin 
only yields the best tune after prolonged use by a master musician.” 
In support of this doctrine he draws upon the interesting analogy: 
“It should not be difficult in these days of ‘wireless’ instruments to 
conceive how a transmitter and a receiver can function together with- 
out a fixed pathway of connection. All that is necessary is that one 
should be attuned to the other. Future research must discover the 
physiological mechanism on which such attuning in the animal body 
depends before a satisfactory theory of the neural basis of habit can 
be formulated.” So we have a glimpse at an hypothesis which, 
although not in accord with Lloyd Morgan’s canon, is plausible enough 
to those who wish to believe, and in the meantime its proponents hope 
to be excused from the necessity of presenting evidence to support their 
theory. 

Lack of space in these columns prohibits much more than mention 
of the chapters on Suggestion, Aesthetic Appreciation, and Psycho- 
analysis. While he denies the Freudian suggestion that all impulses 
are derived from inherited tendencies to action and condemns practices 
of psycho-analysts of this school, Mr. Fox recognizes some of the problems 
of sex and offers some sane suggestions for sex education. In 
treating the topic of Intelligence he counters the Spearman doctrine 
that there is a central fund of mental energy, implies that the concept 
of intelligence may be only a hypothetical abstraction, and repeats 
certain conservative suggestions regarding abuses of intelligence tests. 
He offers the suggestion that fatigue should be studied, not from the 
viewpoint of quantity of work done but from that of steadiness of 
work, and the theory is stated that since “living process is essentially 
rhythmical, anything which baulks the rhythm is bound to interfere 
with action and make it more difficult.” This, according to the author, 
is fatigue, and mental fatigue is thwarting the rhythm of attention. 
Treatment of other conventional topics completes the book. 

The book is well organized. It is not, however, adapted for use 
as a text in American schools, nor is it suitable for use in elementary 
classes. It is a valuable source for supplementary reading for second 
and advanced courses, as well as for teachers; it merits critical reading 
by properly prepared students. Yes, it is a worthy book which de- 
serves much commendation, but is it not fortunate that such praise 
does not require that one accept the hypotheses and theories that are 
advanced by the author? 

EDWIN MAURICE BAILOR. 
Dartmouth College. 





INTIMATE GLIMPSES OF A GREAT PSYCHOLOGIST 


G. Stanley Hall, a Biography of a Mind, by Lorine Pruette. New York: 
D. Appleton and Co., 1926. Pp. 267. 


A biographer may write from two radically different points of 
view. He may keep his eye both on the character about which he is 
writing and on his audience; or he may look only at the subject of 
his study. He may take the attitude that he is to present a character 
to an audience in such a way that each member of the audience, 
by seeing himself mirrored here and there and by being allowed to 
fill in at places with his own imagination, may come to identify himself 
with that great personage. This would demand an analysis of the 
great “drives” in the life of that personage in terms of “drives” that 
we all feel to a greater or lesser degree. It would demand that the 
thought life and the emotional life of the hero be depicted as not 
utterly different from the experiences we all know, though they might 
be wider and deeper. On the other hand the biographer may wish to 
forget his audience, to “let down the bars,” and speak words of 
praise, adoration, wonder, and mystery. The above descriptions refer 
to limits between which there is a continuous gradation, no single 
writer probably remaining at any one level for any two consecutive 
pages. In a given volume, however, a writer will establish himself 
at a certain average level on the scale. Persons may differ as to which 
is the high and the low end of this scale for certain purposes. However, 
if the audience does not blame a writer who wishes to gaze in wonder 
upon a hero, forgetting occasionally that he has an audience, then the 
writer should not blame the members of that audience if they find at 
times a mysterious chasm between themselves and the hero because 
the tie that binds has been left out. 

In spite of the difficulty of showing a unity and consistency of 
purpose running through a career so varied and through thoughts 
and deeds so often misunderstood, Miss Pruette has succeeded in 
pointing out this unity. The biography convinces one of the devotion 
of its writer to the famous teacher. In two of the chapters, “Cousinship 
to Peter Whiffle” and “The Religion of a Scientist,” we find what 
appear to be class notes taken in Dr. Hall’s lectures. From the first 
chapter in which a boy climbs a high hill to the last where as an old 
man with no hope of a future existence he looks at death, the biog- 
rapher is in imagination close to the giant of her book. 

The author has been sometimes guilty of extravagant and almost 
weird statements, but perhaps such statements should not be too 
severely criticized for we are aware that the biographer, a devoted 
disciple, was at times looking only at her master and forgot that she 
was writing words from which the reader would have to construct his 
own picture. One such statement follows: “He was, like Jesus he so 
revered as a great and good man, a fisher of men. He wanted to save 
souls, when science was hinting very vigorously that they had no souls 
to be saved; he wished to see them go forward to new and finer per- 
sonalities; he sought to make men out of vegetables. And strangest of 
all, he sometimes succeeded.” She seems at times also to enjoy 
enshrouding Hall in mystic garments, as is shown by the following: 
“The category of genius falls outside all scientific terminology; it 
fits into no scheme of classification of human nature. On the now 
familiar bell-shaped curve which purports to represent the distribution 
of ability there is a place for the inferior, the average, and the superior 
man, but there is no place for the genius. Nobody knows what causes 
it. It has never been measured.” In the next paragraph we find this: 
“Hall had genius and expressed genius, but not primarily in his work. 
It was his personality which was the chief expression of his genius.” 
All psychologists will agree with the biographer that no one has been 
able to explain what causes a genius, and that no one has been able to 
measure him, but the question is why try to place “inferior, average, 
and superior man” within the realm that may be attacked by science 
and place the genius outside. Probably the question would be answered 
if the biographer could establish the fact that science can at 
present explain what causes the inferior, the average, or the superior 
and is able to measure these in all the manifestations of their nature, 
while it is suddenly paralyzed when faced by a genius. Both of these 
quotations indicate that Miss Pruette has adopted the point of view of 
glorifying a great philosopher-psychologist, and has been at times 
indifferent to the interaction between the spirit of the reader and 
a greater spirit—a spirit which was not utterly different and more 
mystic, but greater. 

In general it is felt that those who had the unique privilege of 
studying under G. Stanley Hall will enjoy and profit by the book in a 
way that those who knew him afar-off can not do. For psychologists 
and educators who were not students of this great teacher there is 
little in the book that will be of absorbing interest. 

VERNON A. JONES. 
Teachers College, Columbia University. 
ANNOUNCEMENTS 


The Bureau of Educational Research of Ohio State University is 
interested in a carefully controlled study of the relative advantages of 
grade and departmental teaching. For that reason any information 
concerning school systems which are about to change either from 
grade to departmental teaching or from departmental to grade teaching 
will be appreciated. If school superintendents or principals of schools 
in which such changes are contemplated will communicate the fact 
it will assist materially in setting up the problem effectively. 
Communications should be addressed to Dr. B. R. Buckingham, Director, 
Bureau of Educational Research, Ohio State University, Columbus, 
Ohio. 


The decision of the American Historical Association to launch a 
campaign for a million dollars endowment has interest for all persons 
and organizations working in the general field of the social sciences. 
The progressive recognition by the historians of the intimate relation 
of the present with the past is re-making that subject. It is tying 
it up with the other social sciences with results favorable to all the 
members of the group. Among the objects to which historical stu- 
dents are now devoting a great deal of time are many that deal with 
the actual development of society. The work of the historian in this 
field is fundamental for the practical observer of the actual operation 
of social processes. 

The purpose for which the Association proposes to raise an endow- 
ment is expressed in the general encouragement of research and in 
assistance given toward the publication of books that are useful to 
scholarship and at the same time are not of such a popular character 
that they can expect to be taken by the ordinary commercial publisher 
without a guarantee by the author. For these forms of assistance 
small sums may well be advanced by such a fund. The aid of young 
research students is well provided for by a number of fellowships, some 
by recently established foundations and many more by institutions 
of learning. But there is little done for the mature scholar, past the 
fellowship age, who often has to depend on his own small professorial 
salary for the funds to enable him to go on with his work. It is 
believed that the success of this effort will have a beneficial influence 
in a wide field of similar study. 